Domain: peacefire.org
Stories and comments across the archive that link to peacefire.org.
Stories · 223
-
Bill Gates Should Buy Your Buffer Overruns
Slashdot regular Bennett Haselton has written in with his latest essay. He starts "WabiSabiLabi generated some controversy recently by announcing their eBay-like site for security researchers to sell security exploits to the highest bidder. But WabiSabiLabi didn't create the black-and-grey market for security exploits, they merely helped draw attention to it. There's nothing that companies like Microsoft can do about the black market where security exploits sell for tens of thousands of dollars, but there's one obvious thing they can do to help protect users: offer to buy up the security vulnerabilities themselves. If they did that, then the exploits would probably never make it onto a black-market auction in the first place, because the "white hat" researchers would have found them and reported them first. Thus I think WabiSabiLabi is doing the world a favor, by shining a spotlight on the black market that thrives when companies won't pay for security bug reports." Click that magical little read more link below to continue the thought.
Really, what is a good argument against companies paying for security exploits? It's virtually certain that if a company like Microsoft offered $1,000 for a new IE exploit, someone would find at least one and report it to them. So the question facing Microsoft when they choose whether to make that offer, is: Would they rather have the $1,000, or the exploit? What responsible company could possibly choose "the $1,000"? Especially considering that if they don't offer the prize, and as a result that particular exploit doesn't get found by a white-hat researcher, someone else will probably find it and sell it on the black market instead? (Throughout this discussion, I'm using Microsoft as a metaphor for all companies which have products in widespread use, and which do not currently pay for security exploits even though they could obviously afford to.)
Perhaps you say that you would be willing to report bugs to Microsoft for free, and I respect people who do that out of selflessness, but that's not the point. Even if you and some other people would do "white-hat testing" for free, there are more people who would do it if there were prizes. The number of people willing to do security testing for free has not been enough to keep exploits from being found and sold on the black market -- but if Microsoft offered enough money, it would be. Obviously if Microsoft offered more than the black-market prices, everyone would just sell their exploits to them. But probably Microsoft could offer much less than the black-market prices and still put the black market out of business, because there are lots of researchers who wouldn't sell exploits on the black market even for tens of thousands of dollars, but would be willing to participate in a legal Microsoft "white hat" program for much less money.
Microsoft would undoubtedly say that they do their own in-house testing, and indeed the offer of a prize should not be used as a substitute for good security testing within a company. But at the same time, the fact that a company does their own testing isn't a good reason for not offering a prize. If a company says that they already do their own in-house security audits to catch as many bugs as they can, that still doesn't answer the question: given that a cash offer would probably result in an outsider finding a new exploit that they missed, why wouldn't they want to take it? Even if there are already outsiders who willingly find new exploits and turn them over to Microsoft for free, there's almost certainly at least one more exploit out there that would be found if they offered a cash prize. (And if the cash prize doesn't turn up any new exploits, then the company doesn't pay out and has lost nothing.)
I've done security consulting for companies like Google and Macromedia who paid me "by the bug", so you might think I'm biased in favor of more such "bounty" programs because I think I could make money off of them. Actually, I think that if Microsoft and most other large software companies offered security hole bounties to everyone in the world, almost all exploits would be picked clean by other people, and my chances of getting anything out of it would go way down, and there would be one less buffer protecting me from having to get a real job. But most people's computers would be safer.
Microsoft does in fact "pay" for security exploits in their own way, by crediting people in their security bulletins. To some people, who report exploits in hopes of being recognized, this is apparently enough. And there are third-party companies like iDefense who will buy your security exploits and then use them to gain reputation-credits for themselves, by handing them over for free to the software developer and warning their own clients about the potential risks. But there are a lot of people including me who have found exploits in the past, but don't consider the benefits of being mentioned in a Microsoft security bulletin to be worth the effort of finding a new one. And even the benefits that iDefense gets from reporting security holes, are evidently not sufficient for them to offer enough money for exploits to compete with the black-market prices (if iDefense got that much benefit out of it, then they'd be able to offer so much money that nobody would sell exploits on the black market). So using recognition as payment is evidently not enough; as Lord Beckett says, "Loyalty is no longer the currency of the realm; I'm afraid currency is the currency of the realm."
A cash prize program might mean that some people get mad when they are turned away for offering "exploits" that don't really qualify, but so what? What are they going to do for revenge, release their "exploit" into the wild? If it's not a real exploit, then it won't do any harm, and if it is a real exploit, then Microsoft should have paid them after all! Some people might threaten to sue if they aren't awarded prizes, even if the rules of the program state clearly that Microsoft is the final arbiter of what counts as an exploit. Maybe in some rare cases they would even win. But all of this could be considered a cost of running the program, just like the cost of giving out the prizes themselves -- and all insignificant compared to the cost of an exploit that gets released into the wild and allows a malicious site to do "drive-by installs" of spyware onto people's machines.
Probably the real reason Microsoft doesn't pay for security exploits is that they don't pay the full price for those drive-by installs and other problems when a new exploit is discovered. I've heard hard-core open-source advocates say that either (a) Microsoft should be held liable for the cost of exploits committed using flaws in their software, or that (b) users of Microsoft software should be held liable for exploits committed through their machines (which would drive up the cost of using Windows and IE to the point where nobody would use it). If that happened, Microsoft probably would pay for security exploits to forestall disaster. But let's make the reasonable assumption that neither of those liability rules is going to come to pass. The real price that Microsoft currently pays for security exploits is in terms of reputation, and the price they're paying right now is too low, because people don't realize that Microsoft could find and fix a lot more bugs by spending only a tiny amount of money -- but chooses not to. Despite all the snickering when "Microsoft" and "security" are used in the same sentence, most people seem to believe that Microsoft is doing everything they can to prevent users from being exploited. But as long as Microsoft doesn't pay for security holes, they're emphatically not doing "everything they can".
It's not that I think security bosses at Microsoft are trying to screw anyone over. They probably just have an aversion to the idea of paying for security holes, and what I'm arguing is that such an aversion is irrational. The people they would be paying money to are not criminals or bad people; they're legitimate researchers who just can't afford to do work for Microsoft for free when they could be doing something else for money. Offering cash will bring in new exploits, and every exploit that is reported and fixed is one that can't be sold on the black market later.
There are some interesting details that would have to be worked out about how such a program would be implemented. For example, what happens if Bob reports an exploit, and then Alice later reports the same exploit, before Microsoft has gotten a chance to push the patch out? Microsoft wouldn't want to pay $1,000 to both of them, because then whenever Bob found an exploit, he could collude with Alice so that they both "independently" reported the same bug and got paid twice. Microsoft could pay only Bob, but Alice could get so disillusioned at getting paid nothing that she might stop helping entirely. My own suggestion would be to split the money between all researchers who report the same bug in the time window before the fix is pushed out. If 10 researchers happened to report the same bug and each only got a paltry $100, some of them would quit in disgust, but if researchers start to leave because the average payout-per-person has fallen too low, then that will drive the average payout back up, so the number of active researchers stays in equilibrium.
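The splitting rule suggested above can be sketched in a few lines of Python. This is only an illustration: the function name, the (researcher, bug) report format, and the flat $1,000 bounty are assumptions made for the example, not part of any real program.

```python
from collections import defaultdict

def split_bounties(reports, bounty_cents=100_000):
    """Split one flat bounty per bug among everyone who reported it.

    `reports` is a list of (researcher, bug_id) pairs received before the
    fix was pushed out. Amounts are in cents to avoid floating-point error.
    """
    reporters_by_bug = defaultdict(list)
    for researcher, bug_id in reports:
        # A researcher who reports the same bug twice is counted once.
        if researcher not in reporters_by_bug[bug_id]:
            reporters_by_bug[bug_id].append(researcher)

    payouts = defaultdict(int)
    for bug_id, researchers in reporters_by_bug.items():
        # Integer division: any leftover cents are simply not paid out.
        share = bounty_cents // len(researchers)
        for researcher in researchers:
            payouts[researcher] += share
    return dict(payouts)

# Bob and Alice report the same bug; Carol reports a different one.
example = split_bounties([("bob", "bug-1"), ("alice", "bug-1"), ("carol", "bug-2")])
```

Because the bounty is fixed per bug rather than per report, Bob gains nothing by recruiting Alice to file a duplicate: the two of them together still collect $1,000.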
Another issue: What happens if a researcher reports an exploit confidentially, and then the next day, the exploit appears in the wild? If Microsoft's policy was that they would pay for the exploit anyway, then a researcher would have no incentive not to sell the exploit twice, once to Microsoft and again on the black market (whereupon it might start being used in the wild). On the other hand, if Microsoft refused to pay for exploits that were released in the wild before they issued a patch, then that might leave many researchers feeling cheated if they turned in a genuine exploit and got nothing just because someone else sold it on the black market before the patch came out. My suggestion would be to simply pay for exploits even if they did subsequently get released on the black market -- on the theory that of the white hat researchers who turn in bugs to Microsoft, most of them would be ethically opposed to selling exploits to black marketeers, so they shouldn't be punished if the exploit ends up on the black market since they probably weren't the ones who put it there. Another would be to make the payout so large that even if researchers got no payment when the exploit got leaked into the wild before a patch was issued, the payout from the times that they did get paid, would more than make up for it.
But whatever rules are decided upon, there should be some sort of monetary rewards for people who confidentially report security flaws to big software companies. Whatever you can say about the merits of rewarding people through "recognition", or through social pressures to practice "responsible disclosure", the one obvious fact is that it hasn't been enough -- exploits still get sold on the black market, and every exploit that gets sold on the black market, would have been reported to Microsoft if they'd offered enough money. The talent is out there that could find these bugs and get them fixed. Most of them just can't afford to donate the work for free -- but the amount of money Microsoft would have to pay them, is far less than the benefits that would accrue to people all over the world in terms of fewer drive-by spyware installs, fewer viruses, and fewer security breaches. And if these benefits were reflected back at Microsoft in terms of greater user confidence and fewer snide jokes about "Microsoft security", then everybody would win all around. There are no barriers to making this happen, except for a mindset that it's "bad" to pay for security research. But if you prevent millions of Internet Explorer users from being infected with spyware, you deserve to at least get paid what Bill Gates earns in the time it took you to read this sentence.
-
Hotmail vs Goodmail
Frequent Slashdot Contributor Bennett Haselton wrote in with his latest column. He says "Are we being too hard on Goodmail for their plans to charge senders a quarter-penny per message to bypass companies' spam filters? Hardly anyone has mentioned that Microsoft has been doing the same thing for years, only (surprise!) charging more. Hotmail lets senders pay a $1,400 "fee" to help get through their spam filter; when I wrote to them about my newsletter being blocked as spam, they said they knew it wasn't spam, but they told me several times they would not even talk about unblocking it unless I paid the $1,400. It's odd that so little attention has been paid to Hotmail's program, since it not only mirrors the Goodmail situation, it validates Goodmail's critics who have said that once you start charging to bypass spam filters, the next step is the marginalization of people who won't pay." Read on for more words.
As you hear words like "Hotmail" and "AOL", you may be tempted to think this doesn't affect you if you've outgrown those companies, but I think that's a mistake. First of all, if you think you might ever run a business that publishes an e-mail newsletter, you'll have to worry that your mail might be blocked unless you pay to unblock it. Second, even if you're only a subscriber to a company's newsletter and you're not worried about filters on your e-mail address, the company publishing the newsletter has to spend time and resources getting their mails unblocked that they send to other people, time that could be otherwise spent improving their services. Third, even if you're not on the Internet at all, in a real sense it affects the kind of world we all live in, if the wealthy are able to communicate with their listeners more easily than everyone else (that gap has always existed, but the Internet narrowed it, and then unblocking-mail fees widened it a little). If the Republican National Committee can get their mail out and MoveOn.org can't, then that could influence elections, and could affect your life even if you're an Iraqi peasant goat farmer who hasn't updated his blog in weeks. And of course what Microsoft and AOL do sets a precedent for what other companies can get away with -- so every anecdote about boneheaded mail filtering that you hear about is potentially significant if it could become the norm.
I wasn't thinking about this when I wrote to Hotmail in 2006 about their users missing our e-mails because of the filter blocking them as "spam", as I jumped through some hoops before talking to a human. But the mentality of the people that I talked to seemed to be that "non-paying sender" and "spammer" were more or less equivalent. I explained that we only send mail to people who request it, we verify all new subscriptions, and every message contains an unsubscribe link. Hotmail replied, "The filters are there for the protection of hotmail subscribers. The Junk Mail Reporting program isn't in place to help you circumvent those filters... I recommend you do what you can on your end to educate your subscribers, keep your mailing lists up to date and follow the other guidelines for senders on the postmaster.msn.com site and don't expect our junkmail filters to be modified." Call me a dreamer, but I thought the whole point of having humans in the loop was that if the filter is making a mistake, you can modify it.
(Many people have suggested that I publish via RSS instead of e-mail. For me the problem with that is that our newsletter is used to send out the location of new sites for getting around blocking software, so that by the time the last sites have gotten blocked in most places, the new ones are being mailed out. As long as people can access their e-mail accounts, they can get the new site announcements. But if we used an RSS feed instead of e-mail, then blocking software companies would just block our RSS feed. And besides, even a normal newsletter publisher would lose most of their existing subscribers if they told everybody that they had to switch over to RSS to receive the newsletter in the future. Is it right that they should have to pay that penalty just because an ISP is falsely labeling their mail as spam?)
The $1,400 "fee" that you pay to help get your mail unblocked at Hotmail's servers goes to a third-party company called Sender Score Certified, formerly known as Bonded Sender, whose certifications are used by Hotmail. I didn't think I could get anywhere discussing with them the ethics of charging people to unblock their mail as spam, so instead I asked them, what would happen if someone forked over the cash and then their enemies started filing phony "spam" complaints against them, hoping to get their certification revoked? I think this is an important question for any spam policing system, but unfortunately it usually puts people on the defensive, because there's no real answer -- if you accept spam complaints, then you allow crackpots to do damage, and if you don't accept spam complaints, how do you know if a client is spamming? Bonded Sender's rep replied, "Do you really have that many enemies? If you are running a true 'non-profit', who is that mad at you? Maybe finding this out should be a little higher on the agenda. Where is the 'peace' in Peace Fire?" I asked the same question again, and eventually he said that complaints were based on SpamCop complaints -- a system known for being set up so that anyone could report anyone as a "spammer" without proof -- and that each such complaint would cause $20 to be depleted from your bond, and once it was all gone, you'd lose your certification.
"After reading all of your emails you have sent me," he continued, "it seems that you aren't really trying to find a solution to anything. You are mainly interested in pointing out flaws in programs and letting me know about how people don't like you." Actually I don't think I have enough enemies to cause me serious problems, but I'm working on it! I aspire someday to reach the level of notoriety achieved by groups like MoveOn.org, which do have enough enemies that if systems like Hotmail's were widely deployed, MoveOn would have to worry about militants falsely reporting their mails as spam in order to cost them money and/or get them blacklisted. That's the other basic problem with certification systems: they don't just favor the wealthy, they also favor the non-controversial. Do we really want an Internet where everyone has to be careful about who they offend, because anyone could get them listed as a spammer? I mean, that would be like having a free online encyclopedia where anyone could edit your bio and say that you killed someone!
Is it legal to block someone's mail as spam until they pay you money? Whoa, before I even use the l-word, I'd better insert a disclaimer. No, not that disclaimer. Nobody could possibly think that I was a lawyer after I filed motions in court with the pages stuck together to prove that judges weren't really reading them, unless I had some kind of career death wish. The disclaimer is that at least from my own experiences suing spammers, the law is whatever the judge wants it to be. Some judges say you can sue spammers out-of-state, and some say you can't. Some of them say you can sue in Small Claims only if you've lost money, and some say you can sue for damages even if you haven't lost anything. Some of them say a non-lawyer is allowed to represent their own corporation in court, and some say no. If judges don't even agree on the basic rules, good luck getting a legal consensus on a more abstract issue. Asking objectively if deliberately blocking non-spam e-mail is "legal" is like asking "Do apples taste good?"
But as a general rule, I think courts take a dim view of systematically publishing false statements about someone to try and get them to pay you off in order to stop. Unless you're a spammer, every time Hotmail labels one of your messages as "Junk Mail", they're publishing something untrue about you (at least to everyone who sees the message labeled as junk), and if you've brought it to their attention, then they may agree the statement is untrue but they go on making it anyway. In libel law, liability is partly determined by how much someone has been harmed by the false statements about them; in the case of mail being blocked as "Junk Mail", the harm is about as direct as possible, because once a message is falsely labeled as spam, most users will never see it. This is why I think people who say "Hotmail/AOL/Yahoo can do whatever they want with their private network" are missing the point. If I used my own "private network" to publish a subscription service that people use to find out the names of new convicted felons in their neighborhood so that they can avoid doing business with those people, would you have no objection if I "accidentally" included your name on the list, but promised to review your situation for one low fee of $1,400?
There was a time in the late '90s when if Microsoft had said they were going to be blocking non-partner e-mails as "junk mail" unless senders paid a $1,400 "fee" to get unblocked, Congress would have hauled up Bill Gates and given him a good wedgie and told him to cut it out. But these days the Department of Justice doesn't have time to worry about other people's lost e-mail when they can't even lose their own e-mails properly.
All this happened at about the same time Goodmail was first attracting controversy for charging senders a quarter penny per message to bypass AOL's spam filters. When the EFF registered DearAOL.com to call attention to the issue (now defunct, but the Wayback Machine saved a snapshot), I hopefully registered DearHotmail.com in case anyone wanted to use that example as well, but nothing ever coalesced around that. Meanwhile, some random mis-fire seems to have cancelled out some other random mis-fire, and Hotmail is apparently no longer blocking my mail, at least until this article gets published.
As far as I can tell, the only reason Hotmail got off scot-free and AOL/Goodmail didn't, was that Hotmail snuck their system in quietly, while AOL and Goodmail announced their partnership with great fanfare, apparently overestimating the extent to which e-mail publishers would greet them as liberators. This doesn't reflect very well on the outrage grapevine, people.
But the lesson took -- when Goodmail recently announced their partnership with four more e-mail providers, Goodmail featured a press release on their own site, but of the four ISPs, Verizon was the only one that issued its own press release. Apparently the other three saw what happened with AOL/Hotmail and got the message.
You didn't ask, but my own idea for an anti-spam system would be to follow a protocol such that when you reply to a list server to confirm your subscription, the reply goes to an address like:
list-peacefire-confirm-481534893-sender=bennett=peacefire.org@mailserver.com
When you send that reply from your Hotmail account, Hotmail would see the "sender=bennett=peacefire.org" part of the address you're replying to, and recognize that to mean that you want to receive future messages sent from bennett - at - peacefire.org. So future messages from that address would be weighted not to be blocked as spam for that user. It wouldn't do anything to unblock person-to-person messages that get blocked as spam, but those are not mis-blocked as often as legitimate newsletters are, and this method would give newsletter publishers a way to get whitelisted at the same time that the user confirms their subscription. It wouldn't be perfect, since if the user then unsubscribes from the newsletter, but bennett - at - peacefire.org is a jerk and continues to send them mail, that mail would still get through because the Hotmail filter for that user still "remembers" that they confirmed their subscription, and doesn't know that they unsubscribed. However, the vast majority of nuisance spam comes from people you've never heard of, not from people whose newsletters you signed up for and then continued to send you mail after you unsubbed.
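To make the address convention concrete, here is a minimal Python sketch of how a mail provider might pull the sender out of a confirmation address. The function name and the exact parsing rules (look for "-sender=" in the local part, treat the last "=" as the stand-in for "@") are my own assumptions for illustration; nothing like this is actually implemented by Hotmail.

```python
def sender_from_confirm_address(reply_to):
    """Return the sender encoded in a confirmation address, or None.

    Example input:
      list-peacefire-confirm-481534893-sender=bennett=peacefire.org@mailserver.com
    """
    # Everything before the final "@" is the local part.
    local_part, _, _ = reply_to.rpartition("@")

    marker = "-sender="
    start = local_part.find(marker)
    if start == -1:
        return None  # not a confirmation address in this (assumed) format

    # Ignore any further "&key=value" tags that follow the sender tag.
    encoded = local_part[start + len(marker):].split("&", 1)[0]

    # The last "=" stands in for the "@" of the real sender address.
    user, sep, domain = encoded.rpartition("=")
    if not sep or not user or not domain:
        return None
    return f"{user}@{domain}"

confirmed = sender_from_confirm_address(
    "list-peacefire-confirm-481534893-sender=bennett=peacefire.org@mailserver.com"
)
```

The provider would then store the returned address against the account that sent the reply, and give later mail from that address extra weight against being filed as junk for that user only.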
Or, suppose you're Amazon and you send mail to millions of users from orders@amazon.com, but you don't want everyone to have that address whitelisted because then a spammer could use the address "orders@amazon.com" to spam millions of people, hoping it would get through the filter of anyone who's an Amazon customer. So in that case people could confirm by replying to:
list-peacefire-confirm-481534893-sender=orders=amazon.com&senderip=72.21.203.1@mailserver.com
When the user sent their reply to that address, Hotmail would parse out the "sender=orders=amazon.com" part and the "senderip=72.21.203.1" part, and whitelist future mails from that address that come only from that IP.
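A rough Python sketch of the IP-restricted variant follows. The class and method names are hypothetical, and the "&"-separated tag format is simply read off the example address above; a real filter would treat a match as one weighting signal among many rather than as a yes/no answer.

```python
import ipaddress

class SubscriptionWhitelist:
    """Per-user record of senders confirmed via a reply-to address (illustrative)."""

    def __init__(self):
        # (user, sender) -> allowed IP address, or None if any IP is accepted
        self._entries = {}

    def record_confirmation(self, user, reply_to):
        """Parse the address the user replied to and remember the sender."""
        local_part, _, _ = reply_to.rpartition("@")
        start = local_part.find("-sender=")
        if start == -1:
            return False
        tags = {}
        for tag in local_part[start + 1:].split("&"):
            key, _, value = tag.partition("=")
            tags[key] = value
        # In the sender tag, the last "=" stands in for "@".
        name, sep, domain = tags.get("sender", "").rpartition("=")
        if not sep or not name or not domain:
            return False
        try:
            ip_text = tags.get("senderip")
            allowed_ip = ipaddress.ip_address(ip_text) if ip_text else None
        except ValueError:
            return False  # malformed senderip tag
        self._entries[(user, f"{name}@{domain}".lower())] = allowed_ip
        return True

    def is_whitelisted(self, user, sender, connecting_ip):
        """True if this user confirmed this sender and the IP restriction, if any, matches."""
        key = (user, sender.lower())
        if key not in self._entries:
            return False
        allowed_ip = self._entries[key]
        return allowed_ip is None or allowed_ip == ipaddress.ip_address(connecting_ip)

wl = SubscriptionWhitelist()
wl.record_confirmation(
    "user@hotmail.com",
    "list-peacefire-confirm-481534893-sender=orders=amazon.com&senderip=72.21.203.1@mailserver.com",
)
```

With that confirmation recorded, mail claiming to be from orders@amazon.com is treated as whitelisted for this user only when it arrives from 72.21.203.1, so a spammer forging the address from another machine gets no benefit.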
I like this idea because it treats everyone equally, regardless of wealth or popularity, as long as they confirm subscriptions to their newsletter (which is regarded as good mailing list hygiene anyway). On the other hand, if you prefer filtering systems that work better for people who are rich and never offend anybody, then you'll be pleased to know that those seem to be winning.
There was a time in the late '90s when if Microsoft had said they were going to be blocking non-partner e-mails as "junk mail" unless senders paid a $1,400 "fee" to get unblocked, Congress would have hauled up Bill Gates and given him a good wedgie and told him to cut it out. But these days the Department of Justice doesn't have time to worry about other people's lost e-mail when they can't even lose their own e-mails properly.
All this happened at about the same time Goodmail was first attracting controversy for charging senders a quarter penny per message to bypass AOL's spam filters. When the EFF registered DearAOL.com to call attention to the issue (now defunct, but the Wayback Machine saved a snapshot), I hopefully registered DearHotmail.com in case anyone wanted to use that example as well, but nothing ever coalesced around that. Meanwhile, some random mis-fire seems to have cancelled out some other random mis-fire, and Hotmail is apparently no longer blocking my mail, at least until this article gets published.
As far as I can tell, the only reason Hotmail got off scot-free and AOL/Goodmail didn't, was that Hotmail snuck their system in quietly, while AOL and Goodmail announced their partnership with great fanfare, apparently overestimating the extent to which e-mail publishers would greet them as liberators. This doesn't reflect very well on the outrage grapevine, people.
But the lesson took -- when Goodmail recently announced their partnership with four more e-mail providers, Goodmail featured a press release on their own site, but of the four ISPs, Verizon was the only one that issued their own press release. Apparently the other three saw what happened with AOL/Hotmail and got the message.
You didn't ask, but my own idea for an anti-spam system would be to follow a protocol such that when you reply to a list server to confirm your subscription, the reply goes to an address like:
list-peacefire-confirm-481534893-sender=bennett=peacefire.org@mailserver.com
When you send that reply from your Hotmail account, Hotmail would see the "sender=bennett=peacefire.org" part of the address you're replying to, and recognize that to mean that you want to receive future messages sent from bennett - at - peacefire.org. So future messages from that address would be weighted not to be blocked as spam for that user. It wouldn't do anything to unblock person-to-person messages that get blocked as spam, but those are not mis-blocked as often as legitimate newsletters are, and this method would give newsletter publishers a way to get whitelisted at the same time that the user confirms their subscription. It wouldn't be perfect, since if the user then unsubscribes from the newsletter, but bennett - at - peacefire.org is a jerk and continues to send them mail, that mail would still get through because the Hotmail filter for that user still "remembers" that they confirmed their subscription, and doesn't know that they unsubscribed. However, the vast majority of nuisance spam comes from people you've never heard of, not from people whose newsletters you signed up for and then continued to send you mail after you unsubbed.
Or, suppose you're Amazon and you send mail to millions of users from orders@amazon.com, but you don't want everyone to have that address whitelisted because then a spammer could use the address "orders@amazon.com" to spam millions of people, hoping it would get through the filter of anyone who's an Amazon customer. So in that case people could confirm by replying to:
list-peacefire-confirm-481534893-sender=orders=amazon.com&senderip=72.21.203.1@mailserver.com
When the user sent their reply to that address, Hotmail would parse out the "sender=orders=amazon.com" part and the "senderip=72.21.203.1" part, and whitelist future mails from that address that come only from that IP.
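As a rough illustration of the parsing step described above, here is a minimal Python sketch of how a receiving mail service might pull the "sender=" and "senderip=" hints out of a confirmation address. The names WhitelistHint, parse_confirm_address and allows are hypothetical, as is the choice to treat the last "=" in the sender field as the stand-in for "@"; nothing here reflects how Hotmail or any other provider actually works.

```python
import ipaddress
from typing import NamedTuple, Optional


class WhitelistHint(NamedTuple):
    """What the user implicitly asked for by replying to the confirm address."""
    sender: str                # address the user wants to keep hearing from
    sender_ip: Optional[str]   # optional IP restriction, or None


def parse_confirm_address(address: str) -> Optional[WhitelistHint]:
    """Extract the sender (and optional sender IP) encoded in a confirm address.

    Returns None if the address does not follow the assumed convention.
    """
    local_part, at, _domain = address.rpartition("@")
    if not at:
        return None

    marker = "-sender="
    start = local_part.find(marker)
    if start == -1:
        return None

    # Fields after the marker are separated by "&", e.g.
    # "orders=amazon.com&senderip=72.21.203.1".
    fields = local_part[start + len(marker):].split("&")

    # Assumption: the last "=" in the sender field stands in for "@".
    user, eq, domain = fields[0].rpartition("=")
    if not eq or not user or not domain:
        return None

    sender_ip = None
    for field in fields[1:]:
        key, _, value = field.partition("=")
        if key == "senderip":
            try:
                sender_ip = str(ipaddress.ip_address(value))
            except ValueError:
                return None  # malformed IP: ignore the whole hint

    return WhitelistHint(f"{user}@{domain}", sender_ip)


def allows(hint: WhitelistHint, from_address: str, from_ip: str) -> bool:
    """True if a later message matches the whitelisted sender (and IP, if set)."""
    if from_address.lower() != hint.sender.lower():
        return False
    return hint.sender_ip is None or hint.sender_ip == from_ip
```

With the two example addresses above, the first would yield bennett@peacefire.org with no IP restriction, and the second would yield orders@amazon.com restricted to 72.21.203.1, so a forged orders@amazon.com message arriving from any other IP would get no special treatment.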
I like this idea because it treats everyone equally, regardless of wealth or popularity, as long as they confirm subscriptions to their newsletter (which is regarded as good mailing list hygiene anyway). On the other hand, if you prefer filtering systems that work better for people who are rich and never offend anybody, then you'll be pleased to know that those seem to be winning.
-
Will AT&T Start Filtering Your Connection?
We have another essay from Bennett Haselton for you to peruse. "Last week's coverage of AT&T's newly announced "anti-piracy initiative" mostly downplayed the key part of AT&T's proposal, which is filtering what their end users can access in the first place, not finding pirates or suing them after the fact. Friday's Associated Press article, which was reprinted on many news sites with headlines like "AT&T to Help Hollywood Track Down Internet Pirates" and "AT&T to ID Offshore Web Pirates", actually said only that "the effort is primarily aimed at pirates who set up operations in other countries" -- and since you can't really "aim" at pirates in Russia and China with anything except missiles, the statement suggests not identifying pirates or tracking them down, but pre-emptively blocking people from connecting to their servers. Only the Red Herring nailed it with their article title, "AT&T to Block Pirated Content"." Follow the magical URL to read the rest of Bennett's words on the matter.
I think this is a crucial distinction, because efforts to filter end users' connections (as opposed to making them pay consequences for their actions after the fact) have always been controversial, even when the content is illegal. The Center for Democracy and Technology successfully overturned a Pennsylvania law that required ISPs to block overseas child pornography sites, partly on the grounds that the filtering included many third-party Web sites as collateral damage. I've argued that a similar private-sector initiative called Canada Cleanfeed, where Canadian ISPs attempt to block child pornography Web sites, would do more harm than good. On the other hand, nobody's fighting very hard for the cause of child pornography downloaders who were caught and arrested. Web sites get sued and shut down all the time, but it was bigger news when Canadian ISP Telus blocked the Web site of a Telus labor union for three days.
So it's a big deal whether we're talking about "pre-emptive" filtering, or fighting piracy "reactively" by going after violators.
AT&T Senior VP James Cicconi said in e-mail that "discussion about what the technology will or won't do is premature until we can invent it", but most of the hints so far have been that the anti-piracy technology will be "pre-emptive", i.e. filtering users' connections. Cicconi said on a conference panel that AT&T has to spend billions on network maintenance to carry illegal pirated traffic -- which they probably couldn't recoup by suing people, so the only way to prevent that would be to block it. And Cicconi has referred to the technology several times as a "network-based solution" -- but what else could that mean, except filtering?
So let's assume that's what's on the horizon. Interestingly, Cicconi said that AT&T did not plan to block actual Web sites. However, he said in e-mail, "If one could, with a high degree of certainty, spot and isolate illegal traffic from an offshore site, would you not think the copyright holders would have a reasonable argument for a court order to block that traffic (as opposed to the site itself)?" Presumably this could refer to a Web page with an index of links to BitTorrent files -- so they'd be willing to block the BitTorrent links, but not the Web page? But from that point of view, why not just block Web sites too? If an overseas webpage has a list of links to pirated content, and that content is served over http from the same Web server, wouldn't they want to block it?
But I doubt this would stem much piracy in the long run, because if connection filtering to fight piracy became more commonplace, then the next generation of p2p file-trading programs would all just have circumvention capabilities built into them, that let you route your connection through a friend at an unfiltered ISP. You're on AT&T, you upload a file to your friend on Verizon which earns you some "credits" with his node in the p2p network, and instead of redeeming those credits to download a file from him, you use his node as a proxy to download a file indirectly from a site in Russia that AT&T is blocking you from accessing. Advanced users can do this already with tools like Virtual Private Networks and Tor, and some tweaks in a p2p program would just bring it within the range of the casual user.
On the other hand, if AT&T starts filtering traffic, it could set a bad precedent that any time a party in a legal proceeding wants a site declared "illegal", they can demand that AT&T (or other ISPs) block the site. It could be a site libeling a person, or a site hosting a decryption tool that breaks some company's poorly-designed code, or pretty much anything that some powerful person wanted to go away. Meanwhile, if an AT&T customer did get accused of downloading pirated content, now they could invoke the "AT&T didn't stop me" defense -- they thought that AT&T was filtering illegal content, and if they could get to it, then that meant it was legal! In both cases the problem comes from someone using the argument that once AT&T started doing any filtering at all, they should have gone further.
So I would watch the situation closely, even if you're not an AT&T user, and don't assume the situation will take care of itself. Cicconi said, "If a company like ours does dumb things and upsets our customers, we will lose them to someone else," which is something I'm skeptical of whenever I hear it used to defend various draconian anti-spam measures, but in this case I think it's even less applicable. When you're talking about spam filters, at least they always bring some benefit to the user (less spam), and the question is whether the free market weighs those benefits properly against the costs (more lost mail). On the other hand, if an ISP filters the user's connection, that brings no benefit to the user, and in a truly efficient market, all customers of such an ISP would just switch to an unfiltered one -- if that doesn't happen, it simply means the market in that case is not efficient. Is your ISP filtering your connection right now? Probably not, but how could you tell if they were? Right now we assume that ISPs don't filter connections because generally it's "just not done" (except when it is). In a few years we might not be so sure.
-
What Happens If You Don't Pay for Goodmail?
Bennett Haselton has written in with his latest report. He starts "Goodmail has announced partnerships with four new ISPs who will charge for "reliable" delivery of your e-mail messages if you want to bypass their spam filters. The news will probably generate another round of editorials like the ones written a year ago about AOL's plan to use Goodmail, including this one from Esther Dyson (for it) and this one from the EFF (against it)." Follow the magical clicky clicker below to read the rest of this story.
If I could ask one serious question of anyone who was defending pay-per-email, or sitting on the fence about it, this would be it: Suppose you sent an extremely urgent e-mail to your doctor or your lawyer, who for the sake of argument you're not able to reach by phone. The recipient's ISP owner happens to see the message before the user retrieves it, and realizes how urgently you need to get it through. So he moves it to the recipient's "spam" folder, and then calls you up and says: pay me $1,000 to move it to the recipient's inbox, or they'll never see it.
Does the ISP have the right to do that? If not, why not?
Perhaps you'd say that Goodmail's 1/4-penny-per-message is reasonable, but $1,000 for one message is too much. But then who decides what is "too much"? The marketplace? Then isn't the ISP admin just another player in the market, and $1,000 is what they want to charge? If you don't like it, you can go somewh... oh, wait, you can't, because there's no other way to get through to the recipient. If you ever get through to your doctor or lawyer, they might switch ISPs after they hear what happened, but should that be your only recourse?
The problem with the ISP charging $1,000 to deliver your message is not that $1,000 is "too much", but that they're charging for a service that has already been paid for. If your doctor or lawyer pays for an e-mail address, they're doing so with the understanding that their ISP will make a reasonable effort to deliver the non-spam e-mails that people try to send them. If their ISP then turns around and asks you for $1,000 to deliver the e-mail, then they're trying to double-bill for the same service, and if they block the message because you don't pay the $1,000, then the ISP is cheating the recipient out of a service that they've already purchased. And it's not just the recipient being cheated; if the recipient has an arrangement with you, as your doctor or lawyer would, then the ISP is interfering in their business relationship with you.
Now, if an ISP using Goodmail offers to let you bypass their filters by paying 1/4 penny per message, how is that different from the doctor example? Well, on the face of it, it's different in at least two ways: first, because the ISP is charging "only" 1/4 penny per message instead of $1,000, and second, they're not saying that your mail will be blocked if you don't pay, only that it might be. But are these qualitative differences, or just differences in degree?
Take the cost-per-message. I have a (verified opt-in) mailing list of about 50,000 people that I send mail to twice a week. In the aggregate, it is just as important for me to get mail out to those subscribers, as it is for some people to get a single mail through to their doctor or lawyer. Also, in the aggregate, it would cost me about $1,000 per month if the ISPs collectively asked for 1/4 penny per message and threatened to block them otherwise. So is there any real difference between requesting $1,000 to unblock 50,000 e-mails, and requesting $1,000 to unblock a single e-mail, if you're just doing it because you know the sender urgently needs to get them through? (It's not a reflection of the ISP's costs -- downloading and storing 50,000 messages at 3 K each, costs almost nothing, certainly not anything close to $1,000. And again, I would argue it's a moot point anyway, because those services have already been paid for.)
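For anyone who wants to check the "about $1,000 per month" figure, here is the arithmetic as a short Python sketch. The list size, mailing frequency, per-message fee and message size come from the paragraph above; the 52/12 weeks-per-month conversion is an assumption added for the calculation.

```python
# Figures taken from the text above.
SUBSCRIBERS = 50_000
MAILINGS_PER_WEEK = 2
FEE_PER_MESSAGE = 0.0025      # a quarter of a cent, in dollars
MESSAGE_SIZE_KB = 3

# Assumption for the estimate: an average month is 52/12 weeks long.
WEEKS_PER_MONTH = 52 / 12

messages_per_month = SUBSCRIBERS * MAILINGS_PER_WEEK * WEEKS_PER_MONTH
monthly_fee = messages_per_month * FEE_PER_MESSAGE

# Size of one full mailing, to put the "costs almost nothing" remark in scale.
storage_per_mailing_mb = SUBSCRIBERS * MESSAGE_SIZE_KB / 1024

print(f"messages per month: {messages_per_month:,.0f}")
print(f"monthly fee at 1/4 cent each: ${monthly_fee:,.2f}")
print(f"data per mailing: {storage_per_mailing_mb:.1f} MB")
```

Under those assumptions the fee comes to a little under $1,100 a month, while each mailing amounts to roughly 150 MB of data spread across all the receiving ISPs.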
And how much difference is there, really, between saying that a message (or a group of messages) might be blocked, and saying that a message definitely will be blocked? If it's bad for your doctor's ISP to call you up and say, "Give me $1,000 or there's a 100% chance that your message doesn't get through," what if they say, "Give me $1,000 or there's a 50% chance that your message doesn't get through," isn't that at least 50% as bad? You could say that in my doctor example, the blocking was deliberate, but in the case of the spam filter, it's accidental. But if an ISP chooses not to fix problems with its spam filter, then in a way it's still deliberately creating a certain percentage of cases where the spam filter will block legitimate mail, even if those cases occur at random.
There is one more difference between Goodmail and the scenarios I've described, which is that Goodmail not only lets you bypass an ISP's spam filters, it also certifies that you are trusted and not a phisher. If an ISP like AOL controls the user-interface that a user uses to check their mail, it can display the blue-ribbon "CertifiedEmail" icon next to a Goodmail-certified message. In this case, an ISP can plausibly claim that they're letting all legitimate e-mail get through, but they're still offering a benefit to Goodmail senders. The problem with this is that since phishing only works on users who are gullible to begin with, a phish could just as easily display the CertifiedEmail icon in the body of the message to try and gain a user's trust. It's all very well to say that a user should know that the CertifiedEmail icon only "counts" when it's displayed in the inbox, not in the message itself. But a user who knows that, would probably also know that their bank's Web page is not 209.211.253.169. And besides, most users of Comcast, Cox, RoadRunner and Verizon will be using their own mail clients like Eudora which won't display the "CertifiedEmail" icon anyway.
So it seems pretty clear that the main benefit of using Goodmail will be deliverability. And that's the basic Catch-22: If an ISP gives the same deliverability to non-Goodmail-certified messages, then who's going to use it? On the other hand, if an ISP gives better deliverability to Goodmail-certified messages than to other messages (much more likely), then they are to some extent misrepresenting the services they sell to their users, since users expect an ISP to make the best effort to deliver all legitimate e-mails, not just the ones from paying senders.
Goodmail likens their service to FedEx or UPS for "enhanced delivery" of paper mail as a way of getting the recipient's attention. But the difference is that if you're trying to reach your lawyer, then the office complex where he works (or the city that maintains the streets to his house) is providing the service that he expects and has paid for, namely, allowing different companies to deliver stuff to him there -- and because you have different choices, that means FedEx, UPS and the USPS have to compete with each other, and that keeps the delivery prices down. On the other hand, if an ISP blocks you from mailing their customer unless you pay their fee, then the ISP is going against what the customer expects them to do, and it is precisely that betrayal of trust that gives the ISP a monopoly on your ability to reach the customer -- which leads to them charging monopoly-style prices, like $1,000 to receive and store a few tens of thousands of messages.
There is a lot of debate about whether "the market" would fix problems of legitimate e-mail being lost. Esther Dyson's editorial was a classic libertarian defense of the free market as the arbiter of systems like Goodmail: "If it's a good model, it will succeed and improve over time. If it's a bad model, it will fail. Why not let the customers decide?" Actually I don't think the free market does fix most e-mail deliverability problems -- I've been involved in a few businesses that sent bulk e-mail (to subscribers who requested it and confirmed their subscriptions), and have had conversations with dozens of others, and we've all had problems sending to Hotmail, AOL, and Yahoo, and I've never, ever heard anyone say that their deliverability problems were solved by "the market". (Usually the problems just come and go, and nobody knows why.) But in a way this is all beside the point. Even if the market would stop more egregious abuses, what gives ISPs the right to charge senders for e-mail services that their customers have already paid for?
I actually met Richard Gingras, the CEO of Goodmail, and Charles Stiles, the postmaster of AOL, at a conference in Seattle last year where they were on a panel defending against the Goodmail controversy. They seemed like nice guys who were genuinely blindsided by the criticism that Goodmail had been receiving. It's easy to see the point of view of Goodmail's defenders -- if Bob wants to pay Alice to "certify" Bob, why would it be anybody else's business? It isn't, until it leads ISPs to steer people towards a system where if you want to be treated like a non-spammer, you have to pay -- even if, strictly speaking, the recipient is already paying to receive your mail.
As for the much-vaunted free whitelisting privileges that non-Goodmail senders will continue to enjoy, in the pre-Goodmail era I once found that AOL was blocking some of my mail to their users, so I called their postmaster department and learned the following facts:
- The first person I talked to, said that he checked the logs and our mail was being blocked because we didn't have reverse DNS set up. I thought this was odd because we did have it configured, but I thanked him and hung up.
- Then, I called back and got someone different. I asked him the same question and he said that according to his logs, our mail was being blocked because someone else at our ISP was sending spam. I asked him why they were blocking our IP address, if it was different from the IP of the alleged spammer; he paused and said, "Is there anything else I can help you with?", and this repeated several times as I thought my phone or his headset wasn't working, before I realized he was just being a dork.
- Then, I called back and got yet another person, and this person said that he could see our mail was being blocked because it contained banned content. I was pretty sure that was wrong, because you get a different-looking bounce if you're sending mail that contains a banned string, but I took a note of that anyway.
- Then, I called back and got a fourth person, who said that our mail was being blocked because some of their users had flagged mail from our IP address as spam. He paused for a brief conversation in the background, then came back and added, "This has already been explained to you, sir." I said that since I had gotten four different explanations in four different phone calls, I figured I could just keep calling and tallying the votes that I got for each explanation, until one of them emerged as the winner.
Much later I found out from someone else about the AOL whitelisting program, and I'm currently testing whether it prevents us from getting blocked. But if none of the people answering the phone at the postmaster department knew or told me about it (and I confirmed that it did exist at the time), how many other organizations or businesses don't know?
ISPs adopting Goodmail say that while Goodmail senders can bypass their spam filters, non-Goodmail senders will continue to enjoy the same deliverability rates that they have in the past. That's what I'm afraid of.
-
How Private Are Sites' Membership Lists?
Slashdot contributor Bennett Haselton has written an essay on a subtle privacy issue affecting many websites (including Slashdot!) He says "Suppose your girlfriend called up Match.com and said, "I think my boyfriend might be cheating on me. His e-mail address is joeblow - at - aol - dot - com. Can you tell me if he's a member?" And Match.com phone support told her, "Why, yes, he is a member. You'd better have a talk with him." After you had gotten over the guilt of getting caught -- I mean, the guilt of cheating -- would you not feel like Match.com had violated your privacy by telling a third party that you were a member?" Keep reading to see what he's getting at and to decide if and when it's a problem.
Something like this is actually possible with quite a few well-known sites -- given a person's e-mail address, it is possible to find out if they have an account with Match.com, PayPal, Netflix, eBay, Amazon, and Google (and, by the way, Slashdot [CT: We'd fix it if I thought it mattered]). For some of those sites, it may even be possible to take a long list of e-mail addresses and use an automated process to find out which of those addresses have accounts with those sites (something I didn't want to risk trying myself, but as a general rule, if you can do it once, you can do it many times, at least if you do it slowly enough). It does not enable the attacker to extract addresses from a site's membership rolls, which is a much more serious type of breach -- in this case, the attacker would have to already know a list of e-mail addresses, and would only be able to find out which of those addresses have accounts with a given service. And it definitely wouldn't enable an attacker to extract more sensitive information like passwords or personal data. But the ability to get a yes/no answer for whether an e-mail address belongs to a member of a given site, should be something that the site designer should take into account.
I'm not even saying that it should necessarily be considered a security hole in most cases, just that it should be something that the site designers decide whether or not they want to permit it -- not something that was left in the open accidentally. Representatives from PayPal and Netflix assured me that they knew about the possibility of this attack and had countermeasures to detect it. In the case of Match.com, on the other hand, I would argue it looks like an oversight. For other sites, whether it's a security hole or not depends on your point of view.
There are three main causes for concern with this issue. The first is simple privacy -- for a site like Match.com, a person may not want other people to be able to find out that they're a member. The second is the possibility of making phishing attacks easier. If a phisher sends spam to a huge number of recipients, hoping to trick them into entering their login details on a counterfeit site, then generally their success rate would be proportional to the number of recipients who are members of that site (of which a certain percentage will be duped into entering their login info), but the speed at which the phishing site is shut down would be proportional to the total number of recipients (since any recipient would carry the same likelihood of reporting the phishing site to an ISP and helping to get it shut down). So if the phisher could find out which addresses on their list belong to actual members of a given site, and send mail to just those people, they could get more successful attacks in proportion to the number of e-mails sent. This is especially true of "puddle phishing" attacks, where only a small percentage of recipients are likely to be members of the site being phished. The third possibility is that the data could be valuable to spammers wanting to advertise a competing site -- a spammer advertising a dating site, for example, could get more bang for their buck by advertising only to Match.com members. (Maybe even try a hybrid spam-with-just-a-hint-of-phish -- spam that says "Rejected a lot on Match.com?" to make the user think at first that the e-mail really is from Match.com, but then steer them towards a competitor.)
With a build-up like this, the attack is disappointingly simple. (In fact, I listed the possible consequences of the attack first, because otherwise the attack itself is too easy to dismiss.) If you haven't already guessed at least one of these methods, the three easy ways to find out if an e-mail address is associated with an account at a given site, are:
- Try to create a new account with that e-mail address. See if you get an error message saying the address is already associated with an account.
- Log in under an existing account, and try to switch to another e-mail address. See if you get an error message saying the address is already associated with an account.
- Use the forgot-your-password feature to request a password be sent to a given e-mail address. See if you get an error message saying that address is not associated with an account.
With most popular sites that I tested, at least one of the above methods fails, but at least one other method succeeds. On Netflix, for example, the forgot-your-password form requires you to enter a last name and a credit card number, so that form can't be used to find out who is a member. On the new member signup page, though, you can enter an e-mail address and be told whether that e-mail address already belongs to a member. With Match.com, on the other hand, I already mentioned the weakness in the password-reset form, but if I tried to sign up for a new account but I didn't correctly pass the Turing test (reading numbers off a graphic and entering them in a text field), Match.com wouldn't tell me if the e-mail address was associated with an existing account. So that form could not be used to sift through 100,000 addresses and find which ones were Match.com members, but it could be used to find out if an individual person was a subscriber.
There are at least two simple countermeasures to this type of attack. The first is to require a Turing test when a user creates a new account, requests a password reset, or changes their e-mail address on file, and make sure that if the Turing test isn't completed correctly, then no error message is displayed about whether a given e-mail address does or does not exist in the system. This makes it hard for attackers to sift through a mountain of e-mail addresses finding out which ones already belong to accounts, but it still enables someone to check if someone is a member, one person at a time. For sites where that would be a privacy concern (again I'm thinking of Match.com), the other solution is better: send an error message to the e-mail address entered, not displayed to the user in their browser. If you try to sign up as joeblow@aol.com, and that address is already associated with an account, then display the normal message telling the user to check their inbox for confirmation -- but then send them a message saying their address is already in the system. eBay, for example, gets this right on their "forgot your userid" page -- if you enter an e-mail address not associated with an eBay account, it simply says, "eBay just sent your User ID to joeblow@aol.com. Check your email to get your User ID." (On the other hand, eBay's new user signup page lets you check if an e-mail address is assigned to an existing member, without needing to pass a Turing test.)
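The second countermeasure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SignupService, its in-memory member set, its outbox list and the message wording are all hypothetical stand-ins, not how eBay or any other site actually implements it. The point is that the browser sees the same reply whether or not the address is already a member, and the real answer goes only to the owner of the inbox.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# The one reply shown in the browser for every well-formed signup attempt.
GENERIC_REPLY = "Check your inbox to continue."
CAPTCHA_REPLY = "Please try the verification again."


@dataclass
class SignupService:
    """Toy signup handler (hypothetical) that never reveals membership on screen."""
    members: Set[str]
    # Stand-in for a mail queue: (recipient, message body) pairs.
    outbox: List[Tuple[str, str]] = field(default_factory=list)

    def sign_up(self, email: str, captcha_ok: bool) -> str:
        # A failed Turing test gets the same reply for every address,
        # so a script cannot learn anything without solving it.
        if not captcha_ok:
            return CAPTCHA_REPLY

        # The membership-dependent message goes to the address itself,
        # not to whoever is sitting at the browser.
        if email in self.members:
            self.outbox.append((email, "This address already has an account."))
        else:
            self.outbox.append((email, "Click the link to confirm your new account."))

        return GENERIC_REPLY
```

Someone probing joeblow@aol.com from a browser would see "Check your inbox to continue." either way; only the person who can read that inbox learns which case applied.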
Netflix, eBay and PayPal also responded to say that they had monitors in place to detect "suspicious" activity, saying that even in cases where the forms did not require a Turing test, they could dynamically detect if someone were using a script to submit the form over and over to harvest data, but they declined to go into more detail. It seems to me this could work for forms that require you to be logged-in, but not for forms that don't. For example, on the Netflix new user page, how would they detect if it's the same person submitting e-mail addresses over and over again? Not by IP address -- you can use Tor and farms of open proxies scattered across the Internet to make it appear as if you're coming from lots of different IP addresses. However, consider the PayPal add-a-new-email-address form. This form does not require a Turing test, and does give you an error message if you try to add an address associated with another account. At first I thought this might be a loophole that an attacker could use to find all the PayPal users in a long list of addresses, but PayPal told me that if you do this enough times under the same account, eventually you will hit a limit where the form starts requiring a Turing test. I never got high enough to hit that limit. However, in this case the "dynamic detection" could actually work -- because you can only perform this action while logged in, and after you hit the limit, to continue testing more addresses would require another PayPal account -- and creating additional throwaway PayPal accounts does require a Turing test for each one. So I'll take their word for it that that attack is blocked, although, it seems to me it would be easier just to require a Turing test on the add-a-new-address page.
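The per-account limit that PayPal described might look something like the following Python sketch. PayPal declined to give details, so everything here is an assumption for illustration: the LookupThrottle name, the threshold of five free attempts, and the in-memory counter are all hypothetical.

```python
from collections import defaultdict


class LookupThrottle:
    """Counts add-an-address attempts per logged-in account (hypothetical).

    Once an account exceeds its free attempts, every further attempt
    must pass a Turing test before the form reports anything.
    """

    def __init__(self, free_attempts: int = 5):
        # Assumed threshold; the real limit, if any, is not public.
        self.free_attempts = free_attempts
        self._counts = defaultdict(int)

    def needs_turing_test(self, account_id: str) -> bool:
        """Record one attempt and say whether it requires a Turing test."""
        self._counts[account_id] += 1
        return self._counts[account_id] > self.free_attempts
```

Because the counter is keyed on the logged-in account rather than the IP address, routing requests through Tor or open proxies would not reset it; the attacker would need a fresh account, and creating one already requires a Turing test.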
On the other hand, perhaps in the case of a site like Netflix, it's not something that users really need to worry about, if the company has no problem with it. Big deal, an attacker can find out whether you're a Netflix user -- but that's not a huge privacy violation, it's not like I shamefully hide those red envelopes under my shirt while I'm scurrying back from the mailbox. Now, a spammer can take a list of addresses and run them through the form to find out who is a Netflix customer, and then spam those users trying to lure them to a competing service -- but that's Netflix's problem, not ours, isn't it? (Well, it's our problem that we get the spam. But without using this attack, the alternative was that the spammer was just going to spam everybody on their list anyway, so by that argument, this attack actually results in less spam all around!)
Except... perhaps an attacker could try the third type of attack, a phishing attack to get people's Netflix usernames and passwords, but not in order to compromise their Netflix account, rather to see if the person has an account with the same password at eBay or PayPal. Perhaps a user would be wary of a PayPal phish since they see so many of them, but they might fall for a Netflix one -- although then the attacker's success would be limited to people who had Netflix and PayPal accounts, and were using the same password for them both...
So it seems to me it's not obvious when this should be considered a problem. (All of the sites mentioned in this article were e-mailed about this issue months ago, and so far none of them considered it a serious enough threat to block all three of the avenues of attack listed above.) If abuse of this type becomes common, perhaps eventually these "queryable membership lists" will come to be considered in the same way as open mail relays -- which were never considered a glaring security hole, but were abused in ways that triggered a shift in people's thinking that got them to be gradually phased out, going from open relays being the default standard up to the early 90's, to the point where many ISPs today prohibit customers from running them. Maybe "queryable membership lists" will start to be abused more, if anti-spam technologies get smart enough that spammers can't send 1 million messages at a time any more and have to limit themselves to, say, 100,000 messages at a time to get through people's filters, so they have to pick which 100,000 of their addresses they could get the most value out of. Or maybe things will go in a completely different direction and this will never become a problem. I just think that, for now, we should be aware that some form of this trick works on the majority of sites that require an account, and the types of abuses described are at least possible.
-
Who's Trading Your E-mail Addresses?
Bennett Haselton is back with another piece on e-mail privacy. He starts "On April 14, 2007, I signed up for an AmeriTrade account using an e-mail address consisting of 16 random alphanumeric characters, which I never gave to anyone else. On May 15, I started receiving pump-and-dump stock spams sent to that e-mail address. I was hardly the first person to discover that this happens. Almost all of the top hits in a Google search for "ameritrade spam" are from people with the same story: they used a unique address for each service that they sign up with, so they could tell if any company ever leaked their address to a spammer, and the address they gave to AmeriTrade started getting stock spam. (I don't actually do that with most companies where I create accounts. But after hearing all the AmeriTrade stories, I created an account with them in April just for the purpose of entering a unique e-mail address and seeing if it would get leaked.)" Bennett continues on if you're willing to click the link.
What's surprising is that as far as I can tell, AmeriTrade has taken almost no heat in the media for letting this happen. Despite the abundant testimonials from bloggers who had their addresses leaked, the story never crossed over into the "mainstream" Internet press. In a recent Bloomberg News story, the FBI warned that E*Trade and AmeriTrade users were vulnerable to spyware installed by criminals in hotels and cybercafes to capture accounts and run pump-and-dump stock spams; no mention of the fact that all AmeriTrade e-mail addresses were apparently already in the hands of spammers anyway (although no one knows if usernames and passwords were leaked to the spammers as well).
This doesn't bode well for anyone who uses any type of online service and wants that service to keep their personal information secure. If AmeriTrade got skewered in the media for leaking customers' personal information to spammers, other companies would see that and learn the lesson. On the other hand, if AmeriTrade gets away with it with barely a whisper in the mainstream news, other companies are going to take note of that, too. Besides, spam and identity theft hurt everyone, not just the victims, because the costs are passed on to all of us in terms of higher ISP charges, higher payment processing fees, and more mail lost due to stringent spam filters.
AmeriTrade disclosed in April 2005 that a tape containing some customer information might have been stolen in February of that year, and many spam victims who blogged about their AmeriTrade addresses being stolen, referenced that incident as the likely cause. But after Bill Katz's blog post became a clearinghouse of sorts for complaints about stolen AmeriTrade addresses (probably as a result of being the first match on Google for "ameritrade spam"), several users posted that they had received spam at accounts that were only created with AmeriTrade in summer 2006. And then my e-mail address got leaked between April 14 and May 15, 2007. So it's pretty clear that some attacker has access to the AmeriTrade customer database on an ongoing basis, and the February 2005 tape theft probably had nothing to do with it.
AmeriTrade says that California law required them to notify their California customers of a potential security breach after the tapes were stolen, and that they went further and notified all of their customers anyway. Since there is now proof that their database is more or less perpetually open to some outside attacker, will they send out another notification letter to customers?
An accidental security breach can happen to any responsible company, especially if they are compromised from the inside. But the trail of blogosphere and UseNet posts indicates that several times AmeriTrade has concealed the full extent of the problem from customers who asked them about it, or has given out information that they already knew was wrong. In one thread in October 2005, a user reported that they wrote to AmeriTrade asking why their AmeriTrade-only e-mail address was getting spammed, and AmeriTrade replied that the spammer might have guessed the address using a dictionary attack, adding:
We have no reason to believe that any of our systems have been compromised. Ameritrade deploys state of the art firewalls, intrusion detection, anti-virus software as well as employs a full time staff of employee's dedicated strictly to Information Security and protecting Ameritrade's systems from unauthorized access.
But that was long after February 2005, when AmeriTrade said that tapes containing customer data were stolen. (Even if that turned out not to be the cause of the spam after all, by that point AmeriTrade knew that their customers' addresses had been leaked somehow.)
Then when my friend Art Medlar complained to AmeriTrade this year about the same thing happening, he got a response saying that even if he was getting spammed at an address that he only gave to AmeriTrade, that could be the result of hackers "implanting 'bots' that have the ability to extract e-mail addresses from your computer, even when you have protective spy software engaged". But of course this makes no sense -- if this were the source of the problem, it would affect everyone's e-mail addresses equally, and would not explain why a disproportionate number of complaints were coming from people who created addresses that they gave to AmeriTrade specifically.
When I sent AmeriTrade my own inquiry, I got a response that was identical to a forwarded message that someone else posted to news.admin.net-abuse.email in April. (To their credit, in this version of the message, AmeriTrade is acknowledging responsibility for the problem instead of attributing it to dictionary attacks or botnets. But the e-mail contains the curious piece of advice: "Please be sure to delete any spam you might receive, then empty your e-mail's trash so that it's no longer kept there, either." Huh? As one reader replied to the UseNet thread: "Cynical Translation: Please don't retain any independent evidence.") At first I didn't realize this was a boilerplate response, so I sent back some more questions, asking, for example, whether they would notify their California customers of the data security breach as required by that state's laws. The second response I got was a copy of the old boilerplate that they were sending out two years ago, blaming "dictionary attacks".
Now, compared to the 1,000 spams I already get every day (pre-filtering), the AmeriTrade spams were just a drop in the bucket, and many of their customers are probably in the same boat. And unlike most AmeriTrade customers, at least I can stop all AmeriTrade spam just by de-activating those addresses, since they aren't used for anything else. (Right now I'm keeping them open just to see what else comes in.) But AmeriTrade's database also contains much more valuable information such as names, PIN numbers (do you use the same PIN number everywhere that you sign up?), and Social Security Numbers. When I signed up for my account, informed by dire warnings that federal law required accurate information "to help the government fight the funding of terrorism and money laundering activities", I gave AmeriTrade my real SSN, address, and other personal data, figuring that if I gave them false information, I might get in more trouble than the experiment was worth. But now that the attacker has my e-mail, they might have all of my other information as well. In the coming months I'll probably start checking my credit report more often than I used to.
Probably someone inside AmeriTrade is selling customer data to an outside spammer. (It seems less likely that an attacker would keep breaking into AmeriTrade repeatedly to get updated copies of the customer list. Once you've broken in and gotten the customer database from 2006, why bother breaking in a year later, taking the risk all over again of getting caught and going to jail, just to get the updated 2007 database? Surely the 2006 list would be enough to run any pump-and-dump stock scam that you want!) Two suggestions to AmeriTrade to tighten their security: First, the number of people within the company who can access the customer database, is probably a lot larger than the number who actually need to access the customer database. Limit access to the e-mail database to people who actually need it. Second, in any cases where different employees really need to have access to the list, try giving them different versions of it, where each version is "seeded" with spamtrap addresses at Hotmail and Yahoo Mail. If the spamtrap addresses that start receiving spam are all ones that were used to seed one particular employee's copy of the list, then you've found the source of the leak. That won't stop the spam being sent to addresses that have already been stolen, but it could prevent further leaks from happening.
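The seeding idea can be sketched as follows. This is a minimal illustration, not anything AmeriTrade is known to do: the function names `seed_copies` and `suspects` are hypothetical, and `example.com` stands in for the Hotmail and Yahoo Mail accounts that would really be registered as spamtraps.

```python
import secrets
from collections import Counter


def seed_copies(customers, employees, traps_per_copy=2):
    """Give each employee a copy of the list salted with spamtraps unique to them."""
    copies, trap_owner = {}, {}
    for employee in employees:
        # Placeholder trap addresses; real ones would be working mailboxes.
        traps = [f"trap-{secrets.token_hex(8)}@example.com" for _ in range(traps_per_copy)]
        for trap in traps:
            trap_owner[trap] = employee
        copies[employee] = list(customers) + traps
    return copies, trap_owner


def suspects(spammed_addresses, trap_owner):
    """Count, per employee, how many of their traps have started receiving spam."""
    return Counter(trap_owner[a] for a in spammed_addresses if a in trap_owner)
```

If every trap that receives spam maps back to the same copy, that copy is the likely source of the leak. Spam arriving at ordinary customer addresses is ignored, since those appear in every copy and identify no one.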
The SEC recently announced that they would suspend trading of companies whose stocks had been the target of spam campaigns to manipulate the price. Perhaps AmeriTrade could do something similar -- once a stock is identified as being promoted in spams sent to AmeriTrade customers, any customer attempting to buy that stock would be presented with a message saying that AmeriTrade was blocking the transaction for security reasons. (If this runs afoul of some SEC regulation that a brokerage has to let you buy any stock you want any time you want, then at least display a big warning when AmeriTrade users try to buy it through their system, saying that the stock has been the subject of a fraudulent promotion scheme and is an extremely high-risk buy.) However, while this would remove the incentive for stock spammers to target AmeriTrade customers, it's also really just covering up a symptom of the problem, rather than addressing the problem itself, which is that a spammer was able to steal the customer information from AmeriTrade's database in the first place.
But whatever they do, AmeriTrade should stop blowing off the people who complain about the spam, with messages about "dictionary attacks" and "botnets". When customers create specialized spamtrap addresses to detect if their e-mails ever get leaked, those are the tech-savvy customers who (a) know what they're doing, and (b) hate spam more than most people, and giving them misleading information is just poking a stick in their eye. Not a smart move when AmeriTrade has been leaking private customer information and is based, as their name indicates, in the most litigious country in the history of the world.
-
Why Are CC Numbers Still So Easy To Find?
Frequent Slashdot contributor Bennett Haselton gives the full-disclosure treatment to the widely known and surprisingly simple technique for finding treasure-troves of credit card numbers online. He points out how the credit-card companies could plug this hole at trivial expense, saving themselves untold millions in losses from bogus transactions, and saving their customers some serious hassles. Read on for Bennett's article.
Some "script kiddie" tricks still work after all: Take the first 8 digits of a standard 16-digit credit card number. Search for them on Google in "nnnn nnnn" form. Since the 8-digit prefix of a given card number is often shared with many other cards, about 1/4 of the credit card numbers in my random test turned up pages that included other credit card numbers, and about 1 in 10 turned up a "treasure trove" of card numbers that were exposed through someone's sloppily written Web app. If the numbers were displayed along with people's names and phone numbers, sometimes I would call the users to tell them that I'd found their cards on the Internet, and many of them said that the cards were still active and that this was the first they'd heard that the numbers had been compromised.
Now, before this gets a lot of people mad, let me say that at first I was planning on holding off writing about this for months if necessary, to give the credit card companies time to do something about it. In other words, I actually had the presumptuousness to think that I had been the first one to discover it, but only because the credit card numbers that I found were still active. (If the trick had been widely known, I reasoned, surely the credit card companies would have found any credit card numbers listed in Google before I did, and gotten them cancelled.) Then I found that the trick had been publicized about three years earlier in a C-Net article by Robert Lemos and was probably widely known even before that. (The article stops just short of describing the actual technique, but one reader posted the full details in a follow-up comment.) Another article from that year in CRM Daily describes an even more efficient trick: Googling for number ranges like 4060000000000000..4060999999999999 to find Visa card numbers beginning with "4060". Google has now blocked that trick, so that trying that as a Google search leads to an error page. But the basic technique of Googling for working credit card numbers, apparently still works. In other words, credit card companies have apparently known about this technique for at least three years, probably longer, and presumably have hoped it would continue being swept under the rug.
At this point, I think the right thing to do is to shine a light on the problem and insist that they fix it as soon as possible. It may result in a short-term spike in people using this technique, but if it results in the problem being fixed, then the total number of fraud incidents will probably be less in the long run.
It would be simple for companies like Visa, MasterCard, and Discover to take a list of the most common 8-digit prefixes, query for them every day on Google, and de-activate any new credit card numbers that were found that way. (American Express cards are apparently not vulnerable to this trick, because when their 15-digit card numbers are written with spaces, they are usually written in the format "3xxx xxxxxx xxxxx", and Googling for the first 10 digits as "3xxx xxxxxx" didn't yield anything in my random test of ten AmEx numbers. But this is still their problem too, since the searches that turn up "treasure troves" of card numbers usually include AmEx numbers as well.) A Perl programmer could write a script in one afternoon that could run through all the known 8-digit prefixes, parse the search results, and pick out any URLs that weren't listed as matches the day before. From there, the search results would have to be reviewed by a human, in order to spot any situations where one credit card number was exposed at one URL, and a slight variation on the same URL (such as varying an order ID number) would expose other credit card numbers as well, which was the case with several of the hits that I found. Simple, but time-consuming with so many different 8-digit prefixes -- but every minute of effort expended on tracking down and canceling leaked credit card numbers, would save time and grief later by preventing the numbers from being used by criminals. If it would save them time in the long run and help prevent fraud, then why don't they do this?
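The "pick out any URLs that weren't listed the day before" step of such a monitoring script is just a set difference. The sketch below is in Python rather than Perl and covers only that step; the search and parsing stage is left out, and the function name `new_hits` and the prefix-to-URLs dictionaries are assumptions made for illustration.

```python
def new_hits(today, yesterday):
    """Return, per prefix, the result URLs seen today that were absent yesterday.

    Both arguments map an 8-digit prefix string to an iterable of result URLs,
    as produced by whatever search step the card company runs each day.
    """
    fresh = {}
    for prefix, urls in today.items():
        unseen = set(urls) - set(yesterday.get(prefix, ()))
        if unseen:
            fresh[prefix] = sorted(unseen)   # sorted for a stable review queue
    return fresh
```

Only the new URLs go to the human reviewer, who can then check whether a slight variation on the same URL exposes further numbers.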
It's considered good etiquette among security researchers, when finding a new security hole, to give the affected companies a chance to fix the issue before publicizing it. When I first contacted the credit card companies and described exactly how the exploit worked and how to block it, after getting a polite "We can't comment" from each one, I figured I'd give them a few months to get a system in place that could find leaked cards on a daily basis and de-activate them before they could be used. But then I found the C-Net article from 2004, and figured that if the card companies hadn't taken action in three years, it was fair game to publicize the trick in order to increase the pressure on them to plug the gap. Of course, it's not the card companies' fault that these card numbers are leaked onto the Web; it's the fault of the merchants that allowed them to get leaked. But the credit card companies are the only ones who are in a position to do something about it.
I did try the "Good Samaritan" approach, calling the credit card companies when I found one of their customers' card numbers on the Web. For each of the four major card companies, I called their security departments and reported two of the cards that I had found compromised, and then a week later, called the cardholders themselves to see if the card companies had notified them. Surprisingly, of the four companies, American Express was the only one whose customers in this experiment, when I called them a week later, said that AmEx had contacted them and told them to change their numbers. But even if all four credit card companies were more proactive about acting on reports of leaked numbers, the problems with scaling this approach are that (a) I usually had to wait on hold for a few minutes with each company and then spell out each card number that I'd found, which doesn't scale for a large number of stolen card numbers, and (b) if lots of people started doing this, then the credit card companies would be inundated with duplicate reports about the "low-hanging fruit", card numbers with common prefixes that appear near the top of some Google search result. Both problems could be avoided if the card companies simply ran their own script that queried Google and brought up a list of any indexed card numbers, whereupon an employee could copy and paste the numbers into an interface that would flag the cards instantly.
Google does have a feature where you can request the removal of pages that contain credit card numbers and other personal data such as Social Security Numbers. Any pages that I found containing credit card data, I submitted for removal, and Google did handle each removal request within two days. But this doesn't guard against the possibility that someone might have found the credit card information before it was removed, and of course it doesn't mean that other search engines like Alta Vista (remember Alta Vista?) might not have indexed the same pages. Running a sample of 8-digit prefix searches on Alta Vista, I found about as many credit cards as I found through Google, including some pages that were not in the Google index (maybe Google never indexed them, or maybe they had removed them already). So removing a page from any engine's search results is more like covering up a symptom of a problem than fixing the problem itself, which is the fact that the card number was leaked to the Web in the first place.
If nothing else, this is another reminder of how terrible the security model is for credit card numbers as a token of payment -- one universal piece of information shared with every merchant, that can be used for unlimited unauthorized charges if it gets compromised, until someone notices. About the only desirable property of credit card numbers from a security point of view is that they can be changed, and most of your existing recurring billing relationships will carry over, but even that is a hassle. Several credit card companies do provide the ability to generate single-use credit card numbers, each one authorized only for a limited purchase amount. The problem with that is that as any security analyst will tell you, if it takes even one extra step, most people won't bother -- as long as all-purpose credit card numbers are the default, that's what most people will use. Perhaps incidents like this will push people towards more 21st-century-aware styles of payment (like PayPal, but without all the horror stories), where you can pay a bill through a system that debits your card or your bank account, without sharing all your information with the merchant.
But in the short term, as long as credit card numbers are still with us, the card companies should make more proactive efforts to find and deactivate the ones that have been leaked on the Internet. If the card numbers are found to be leaked by a clumsy Web interface on one company's site, then that company should be chastised by the card companies that issued them a merchant account. If the numbers are found together in a list posted on some third-party forum, then the companies can cross-reference the charge history against each card in the list, to narrow down which merchant may have been responsible for the leak. I'm sure the card companies do something like this already when they find a list of leaked cards; what they don't seem to be doing is acting aggressively enough to find the leaked numbers in the first place.
Maybe the real moral is not the insecurity of credit card numbers, but the value of transparency and online community relations. If MasterCard had been a hip company like Wikia, some volunteer probably would have discovered this attack very early, and another volunteer would have written an open-source tool to find and deactivate leaked MasterCard numbers automatically, and the problem would have been solved ten years ago. In fact many tech companies, if you report a security problem to them, will thank you and fix it immediately, and some of them will even offer you cash if you find any more, like Netscape used to do with their $1,000 Bugs Bounty program. We get so used to big companies having obvious holes in their security practices and answering every question about security with a flat "No comment", that we forget it doesn't have to be that way -- transparency is not just trendy, it works. After years of having bug hunters poke at the Netscape browser, the security may not have been perfect, but it didn't have any security holes that were as simple and obvious as to be analogous to finding credit card numbers on Google.
-
Even My Mom Could Hack These Sites
Frequent Slashdot Contributor Bennett Haselton's latest story is ready for your consumption. He starts "Recently, as an experiment, I wrote from my Hotmail account to ten different hosting companies that were each hosting some of my Web sites, asking for logins to change the domain settings. Even though I never provided any proof that the messages from the Hotmail account were really coming from me (the address they all had on file for me was a different one), half of them replied back and gave me the logins that I needed."
I figured that if I wrote to them saying "I forgot my password, please mail it to me," that would be too obvious. Instead, at the time I had set up shop with these hosting companies, I entered a domain name at the time of creating my account, and asked them to register it on my behalf (long before I had this experiment in mind). Then when I wrote to them recently from my Hotmail address, I sent each of them a message saying: I need to transfer this domain somewhere else, can you give me the login at the registrar where you registered the domain, so I can change the domain settings. Five of the ten companies either (a) gave me the registrar login, (b) transferred the domain to my registrar account on request (even though I never provided any proof that the owner of that registrar account was really me, either), or (c) changed the domain to point to a new IP address that I specified -- all of which, of course, would allow an attacker to take over a site temporarily or even permanently, if it hadn't really been me writing from the Hotmail address.
But slow down before you go off to try this out on Yahoo, eBay or Google hoping to get the same 50% success rate. First, these were all low-budget hosting companies, so the people handling my queries were likely not highly trained professionals who would have developed all the right habits about when to get suspicious. Second, this ruse only worked because the hosting companies registered the domains on my behalf. Most sites that are really worth taking over, are hosted on dedicated servers, and this trick wouldn't work on a dedicated hosting company because they usually don't register domains on behalf of customers; they assume that anybody buying an expensive dedicated server, knows enough to buy the domain and point it at the server that the company gives them.
But even for small-time hosting, a 50% success rate for a trick like this is uncomfortably high. So what can we do about it? Well, every problem has a non-solution that requires changing human nature ("People should just stop buying from spammers and they'd go out of business!") and a non-solution that ignores the economics of the situation ("ISPs should devote more resources to stopping spammers on their own network!"). In this case, the corresponding non-solutions would be (a) "People who work for hosting companies should be less gullible" and (b) "ISPs should hire smarter people, without charging more to their hosting customers".
The solution that doesn't require any cheating, though, is to have procedures in place for anything remotely security-related, and drum into employees' heads that they have to follow those procedures. Here's some good news: Of the five companies that fell for the ruse asking for my registrar login information, when I followed up with them saying "Hey, I forgot my account password, can you mail it to me", only two of them actually sent my password to the Hotmail account. To those two, I replied with some terse words about having a six-inch-thick steel door while leaving the window wide open. But at least it was only two out of ten that fell for that ruse, compared to five out of ten that fell for the registrar trick. The difference is that hosting companies have procedures in place to deal with password resets -- a script that sends the existing password, or sends a reset-password link, only to the customer's e-mail address on file.
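The password-reset procedure described above is easy to express as code, which is part of why it is harder to talk around. The following is a sketch only: `build_reset_email`, the `addresses_on_file` dictionary and the `host.example` URL are hypothetical, not any hosting company's actual script.

```python
import secrets


def build_reset_email(addresses_on_file, username, requested_from):
    """Return (recipient, body) for a reset request, or None for an unknown user.

    `requested_from` is the address the request arrived from. It is accepted
    but deliberately never used as the recipient.
    """
    on_file = addresses_on_file.get(username)
    if on_file is None:
        return None
    token = secrets.token_urlsafe(16)                    # one-time reset token
    link = f"https://host.example/reset?token={token}"   # placeholder URL
    return on_file, f"Use this link to reset your password: {link}"
```

However persuasive the message from the unverified address, the reply can only go to the address the customer registered.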
Similarly, any hosting company that registers domains on behalf of users, should have procedures in place for transferring the domains to users or letting them change domain settings. In fact, of the five companies that didn't fall for the ruse, most of them said "Go to the customer control panel here and log in" -- it wasn't that their guard went up because I was writing from a Hotmail account, it was that they already had procedures in place for a customer wanting to change domain settings, and that's what the idiot-proof book told them to do. Kevin Mitnick always said that the weakest link in any security chain was people. Sometimes the way for ISPs to tighten security is to make the people in the chain act more like machines.
Until then, there are probably many sites out there that are this easy to "hack", using a method that could charitably be called low-tech. After seeing which hosting companies fell for the trick, I pointed out that they had sent the login information to an unverified address and admonished them to be more careful in the future, but I didn't storm out vowing to take all of my business elsewhere -- after all, if 50% of all low-budget hosting companies out there fall for this, what would be the point?
-
How to Stop Digg-cheating, Forever
The following was written by frequent Slashdot editorial contributor Bennett Haselton. He writes "Recently author Annalee Newitz created a bit of a stir with the revelation that she had bought her way to the front page of the story-ranking site Digg. Since Digg allows any registered user to go to a story's URL and "digg it" in order to push it upward through the story-ranking system, it was inevitable that services like User/Submitter would come along, where a Digg user can pay for other users to cast votes to push their story up to the top. User/Submitter says they are currently backlogged and not taking new orders, but they say the service will return and will soon feature services for manipulating similar sites like Digg competitor reddit. Even if the new U/S features are vaporware, it probably won't be long before other companies offer similar services. But it seems like all of these story-ranking sites could prevent the manipulation by making one simple change to their voting algorithm."
Before getting to that though, what's at stake? The revelation that Digg could be trivially manipulated did not cause the site to be overrun with bogus stories all at once -- most of the links on the front page still look interesting. Newitz said that her story, which was deliberately chosen to be as lame as possible, got buried by users soon after it hit the front page, which is how Digg cleans spam stories out of the system. However, she also said that in the time that the story was on the front page, the story got about 35,000 hits, whereupon her server crashed and the traffic was thereafter divided with two other mirror sites; presumably if the server had stayed up, she would have gotten about 100,000 hits, all for an initial expenditure of $100, which is orders of magnitude cheaper than buying advertising any other way.
(If she had done the same thing with a good story instead of a deliberately lame one, presumably the traffic gains resulting from word-of-mouth and repeat visitors would have been even higher.) As long as the benefits outweigh the cost, more and more unscrupulous users are likely to pay for such services, and since the service provided by User/Submitter is easy to copy, probably similar services will spring up to drive the price down even further. If nothing changes, then eventually sites like Digg and reddit will be flooded with nothing but paid stories. Most of the stories on the front page will probably still be interesting (why would you pay to promote a link, unless it was good enough to draw repeat visitors and get the most value for your money?), but everybody who didn't pay for votes would eventually get crowded out.
One Good Samaritan, Jim Messenger, managed to shut down one Digg manipulation service called Spike The Vote, by buying it out (for a paltry $1,275 - they must have wanted to get out fast) and then turning it over to Digg. He warned people that the moral was: Don't sign up for Digg manipulation services, since Digg might get your information from them and then you'll be banned. Actually, I think the moral is simpler: if you're going to try anything like that, do it from a throwaway account that you don't care about losing if you get caught. (Or, only sign up with manipulation services which publish a privacy policy promising never to share your information, especially not with sites like Digg. Then if Digg buys them out, the site has violated its privacy policy and Digg as the new owner inherits the liability for that, so you can sue them, right?) But as the idea spreads, it will probably become impractical to play whack-a-mole by shutting down manipulation services as they keep springing up. Any time the cost of providing a service (clicking on a few buttons) is small compared to the benefits of receiving the service (100,000 hits in 24 hours), a market will exist for it one way or another, whether you're talking about drug-smuggling, prostitution, or selling Digg votes.
However, I think there's a way to fix it, and here it is. Have you ever seen people put a link in their profile to their HotOrNot picture, saying "Go here and vote me a 10!!"? It's similar to the people who send links to their friends and say, "I just posted this, please Digg this for me!" The difference is that on HotOrNot, it doesn't work. On HotOrNot, you can cast votes for a picture in one of two ways. The first way is to go directly to the URL for someone's picture; the second way is to load the front page, where a picture is selected at random from the database, and vote for whatever picture comes up. The catch is that the votes that you cast by going directly to someone's picture are simply ignored in calculating the average score for that photo. The only votes that are counted are the votes cast for random pictures displayed on the front page. So if you want to manipulate the voting for your own photo, you'd have to load the front page hundreds of thousands of times waiting for your own picture to come up repeatedly, which is hard to do without being detected.
To enable an algorithm like this on Digg and reddit, the sites could present users with a sidebar box that displays random stories from the pool of recent submissions. (reddit already has a serendipity feature that users can use to select a random story from the available pool, which could be leveraged for this purpose.) Once a story has collected, say, 100 votes -- or whatever number is considered sufficient to provide a representative random sample of how the story appeals to people -- then on that basis the story can either be buried or promoted to the top, where it would be seen by, say, 100,000 people. The elegance of this system is that bad content would only be seen by 100 people on average before it's buried, whereas good content would be seen by all the 100,000 people who view it on the front page, so the average user sees 1,000 pieces of good content for every 1 piece of crap. Even if 75% of users ignore the random story box completely, that just means you have to display it to 400 users instead of 100 before you have enough data points for a good random sample.
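The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, the 100-vote sample and 100,000 front-page views are the example figures from the text, and the 1,000-to-1 ratio further assumes that good and bad stories are submitted in roughly equal numbers.

```python
import math

def impressions_needed(votes_needed: int, response_rate: float) -> int:
    """How many times the random-story box must be shown to collect
    `votes_needed` votes, if only `response_rate` of viewers vote."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(votes_needed / response_rate)

def exposure_ratio(front_page_views: int, sample_size: int) -> float:
    """Views of a promoted story per view of a story buried after sampling."""
    return front_page_views / sample_size

# Everyone votes: 100 impressions give 100 votes.
print(impressions_needed(100, 1.0))
# 75% ignore the box, so only 25% vote: 400 impressions are needed.
print(impressions_needed(100, 0.25))
# 100,000 front-page views versus 100 sample views.
print(exposure_ratio(100_000, 100))
```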
I suggested essentially the same algorithm for how an open-source search engine could work without being vulnerable to gaming even by those who understood all of its inner workings. The main difference, of course, is that Digg and reddit actually exist now. Digg declined to comment on the possible merits of such an algorithm; reddit's Steve Huffman said that the idea sounded interesting, although even if the idea got full buy-in, naturally any proposed change would take a long time to bring to fruition.
But it seems that an algorithm similar to this one would be the only way to prevent cheating on sites like Digg that sort content based on user votes. So it's ironic that HotOrNot, the only site I know of that is using a variation of this algorithm and hence is probably the most secure against cheating, is also the one where cheating is least likely to be a problem. Getting a high placement on Digg might enable you to make some money, but getting a highly rated picture on HotOrNot isn't going to make you rich (unless it helps you meet a millionaire who is using the site to find his third wife). Also, making HotOrNot meritocratic doesn't give people an incentive to improve the "content" that they submit, because up to the limits of what can be done with hair and wardrobe, you can't make yourself that much more attractive. With Digg and reddit, on the other hand, I might work harder at submitting a good story, if I knew that it worked in a perfectly meritocratic fashion that pushed good stories right to the top.
If you do this, you don't need any of the other countermeasures listed in Annalee Newitz's follow-up piece "Herding the Mob", such as analyzing user account history for suspicious behavior. As long as most users in the system are legitimate, most of the users in your random sample will be legitimate as well, and their voting will be representative of what most of the community would think. A story could also get a high score within a specific sub-area of the site like the sports page, but be kept off the main site's front page, if the story got a high score from a random sampling of sports-oriented users but a low score from a sample of everyone else.
You could even sub-divide the topical areas further, down to a level of granularity like "Would Barack Obama make a good president?" A site called Helium is currently trying something like this -- users can submit essays on subjects like "Racial inequality or oppression: Do they truly exist in todays society?", and vote on how to rank other essays against each other. The voting works on the random selection principle that I'm advocating here -- users are presented with a pair of randomly chosen essays from a given category (not necessarily the same category for which you submitted an essay) and told to vote for the better one, so there's no way to tell all your friends to go to the link for your essay and give it a high rating. The main limitation though is that while the votes can push you to the top of a particular sub-category, that won't cause your article to "break out" and get to the front page of the site -- Helium says that those front-page articles are chosen at random by employees from among those articles that are highly rated within their narrow category, so just being good is not enough. And if you want to write something that doesn't fit into any existing categories, you have to create a new category for your essay like I did, which will then be a category containing one essay that nobody else ever sees. Perhaps both of these limitations could be overcome by adding the option to rate randomly selected essays on a scale of 1 to 10 -- thus providing a way to rate essays that exist alone in their own category, and also a way to find the best essays across the entire site, rated against each other.
If Digg or reddit adopts a model that uses the random-voter-selection method, then there's the issue of how to handle the votes cast by users under the current system -- the ones who go to a story link and click "digg it", which is what makes the existing system vulnerable to gaming. Digg could do what HotOrNot does, and just ignore those votes outright, but users would probably view this as deceptive. Perhaps Digg could say that votes cast by self-selected users (the ones who go straight to the story link) are counted along with votes from randomly-selected users, unless the average of the self-selected votes is significantly different from the average from the randomly-selected votes, in which case the self-selected votes are ignored. Hopefully this would satisfy most users and preserve the "community" feel of the site, and only a spoilsport would point out that counting the self-selected votes only if they agree with the randomly-selected votes, is exactly the same thing as ignoring the self-selected votes entirely.
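The "spoilsport" point can be made concrete with a short sketch. Everything here is hypothetical: votes are represented as 1 (up) or 0 (down), and the `tolerance` parameter stands in for whatever "significantly different" would mean in practice, which the proposal above does not specify.

```python
from statistics import mean

def story_score(random_votes, self_selected_votes, tolerance=0.1):
    """Score a story from 0/1 votes, counting self-selected votes only
    when their average is close to the randomly sampled average."""
    random_avg = mean(random_votes)
    if not self_selected_votes:
        return random_avg
    if abs(mean(self_selected_votes) - random_avg) > tolerance:
        # Self-selected votes disagree with the sample: ignore them.
        return random_avg
    # They agree: count everything together.
    return mean(list(random_votes) + list(self_selected_votes))

sample = [1, 0, 1, 0]                 # random voters are split
print(story_score(sample, [1] * 50))  # 50 friends voting "up" change nothing
print(story_score(sample, [1, 0]))    # agreeing votes are counted, same result
```

Because the combined average always lies between the two group averages, the final score can never move more than `tolerance` away from what the random sample alone would have given.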
I asked the owner of User/Submitter what he thought about this. He was willing to talk with surprising candor (except about things like his real name) and spoke as if he'd like nothing better than for Digg to make changes to their service that would block his system from working. To both Annalee Newitz and me, he said, "We find it interesting that Digg still allows anybody to view any user's diggs. By way of this 'feature,' User/Submitter is able to verify that our users actually digg the stories they're given. Without this feature, Digg users are given complete digging privacy, and User/Submitter cannot exist." Some have expressed skepticism that the Digg cheaters really want Digg to fix the problem. But as a security tester, I can understand that mentality. If you report a problem, and a company doesn't fix it, eventually you get tempted to publicize the problem to draw attention to it. And if they still don't fix it, and it's a fairly benign security hole that merely enables some pranksters to get some undeserved attention, why not build a service around exploiting the hole, if it will highlight the problem and encourage it to get fixed?
So I'm going to go out on a limb and say the U/S guy sincerely wants Digg to be more secure. However I disagree with him about his proposed fix, that of hiding a user's digg history. First of all, it won't stop anyone who creates a multitude of accounts all under their control -- you can use Tor to make it appear that you're coming from many different IP addresses, and build up a history of "legitimate" votes before using your votes to push sites deliberately. (Be sure to use different browsers, or vary your User-Agent header if you know how to do that, so that a series of votes from identical browser types doesn't give you away.) If your service does work by paying other users to cast votes, then you could still audit whether they're casting their votes honestly -- for example, create a test story, use 5 sockpuppet accounts to digg it 5 times, then tell your confederate to digg it. If the number of diggs doesn't go up to 6, then you know they're not honoring their end of the deal, and kick them out of the system. As long as most confederates think there might be some chance of getting caught if they don't play along, most of them would probably cast the votes that they were paid for, since it costs them nothing to do so and they wouldn't want to jeopardize their stream of easy money.
I asked the owner of User/Submitter if his service could defeat the random-sampling algorithm I described. "It would slow down our service," he answered, "but certainly wouldn't eliminate it because eventually a U/S User will have an opportunity to vote on a U/S Submission by way of chance." But I don't see how this would beat the algorithm -- some U/S voters would still get to vote on the story, but as long as there are far more legitimate voters than U/S voters, then a random sampling will almost always contain far more legitimate voters. The U/S owner also said, "Randomized voting privileges would be unnecessarily confusing, frustrating, and fragmenting. Not to forget: unfair and undemocratic." Well, you could keep it from being "confusing" or "frustrating" by keeping the existing interface (with the possible addition of a randomly-selected-story box), so that the only changes would be in how the votes are handled under the hood. "Fragmenting"? If anything, it seems to me that the existing Digg/reddit algorithms would be more fragmenting, keeping users within their existing communities of friends who vote for each others' stories; a random-selection box would give stories with "crossover appeal" a greater chance of success, bringing them to the attention of users who might otherwise never have seen them. As for "unfair and undemocratic", presumably this is a reaction to the fact that the votes of 100 users decide what everyone else sees. But it's already the case with Digg that the votes of a small number of users decide what content becomes popular. At least with a random sample of users, it would be the case that the vast majority of the time, the voting outcome would be the same as it would have been if the entire site had voted, due to the magic of representative sampling.
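To put a rough number on "almost always", here is a minimal sketch that treats each sampled voter as independently being a paid voter with some fixed probability (a binomial approximation; the function name and the 5% figure are assumptions for illustration, not data from Digg or User/Submitter).

```python
from math import comb

def prob_paid_majority(sample_size: int, paid_fraction: float) -> float:
    """Probability that more than half of a random sample are paid voters,
    modelling each sampled voter as paid with probability `paid_fraction`."""
    threshold = sample_size // 2 + 1
    return sum(
        comb(sample_size, k)
        * paid_fraction ** k
        * (1 - paid_fraction) ** (sample_size - k)
        for k in range(threshold, sample_size + 1)
    )

# If 5% of active voters were paid, a paid majority in a 100-vote sample
# would be vanishingly unlikely.
print(prob_paid_majority(100, 0.05))
# Only when paid voters approach half the user base does it become likely.
print(prob_paid_majority(100, 0.5))
```

This only measures the chance of an outright paid majority; paid voters could still tip a story that legitimate voters were nearly split on, so the sketch gives an intuition, not a security proof.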
So, I'm putting this suggestion out there for the same reason that Jim Messenger bought out Spike The Vote -- because I don't want sites like Digg and reddit to be manipulated by the abusers. In fact, if they used this algorithm, they would become more meritocratic than they are now, because the systems would strictly favor the highest-rated content, instead of content written by people who have informal networks of friends who can all go digg their stories for them. If I were to design the user rating system to make it cheat-proof, these are the exact details of what I would do:
- Wherever they decide to post the "random story sampling" box (on the front page, or on a link off to a separate page, etc.), have it work so that as soon as new stories are submitted, they can be rotated into that box and displayed to a random set of users, until it's reached its total of 100 votes or however many are required to get a random sample.
- You can have "shutout voting" to kill off stories early that are obvious spam or otherwise really useless, without going through the full 100 votes. (For example, if 90% of the first 10 votes are negative, then stop collecting votes.) This decreases the number of users "inconvenienced" by really obvious spam and other garbage.
- For someone to submit content that gets rotated into that voting process, have them submit a Turing test (read numbers off of a graphic and type them in), or something similar. This prevents spammers from submitting spam content over and over just to have it viewed by those initial 10 voters. If they have to type in a number each time, it's not worth it.
- When users give votes to a story, give them the option to say why they voted the way that they did. (This is especially valuable if they're giving negative votes, then the submitter would know what to improve.) Personally I think the comments would be more valuable if each user can't see other users' comments, at the time they submit their own comments; this prevents the "me too" effect where everybody echoes the first two commenters. (When I ask for independent comments from people, and they almost all say the same thing without seeing each other's comments, that's when I know they have a point!)
- To prevent an attacker from having their own username hit the random-voting page over and over in hopes of voting up their own content, make sure that each user account is only allowed to vote on a given piece of content once (even if they found the content through the random-story page).
- Require a Turing test for new user signups. This would prevent an attacker from registering a huge number of accounts just to hit the random voting page with different users over and over, in hopes of getting to vote on their own submitted content eventually.
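As a rough illustration, the vote-handling rules in the list above could be sketched like this. The class and constant names are invented for the example, the rule "promote if more than half the sampled votes are positive" is an assumption (the list does not fix a cutoff), and the Turing tests and voter comments are left out.

```python
import random
from dataclasses import dataclass, field

SAMPLE_SIZE = 100        # votes needed for a full sample (figure from the list)
SHUTOUT_AFTER = 10       # check for obvious spam after this many votes
SHUTOUT_NEGATIVE = 0.9   # share of negative votes that triggers a shutout
PROMOTE_AT = 0.5         # assumed cutoff: promote if more than half are positive

@dataclass
class Story:
    title: str
    voters: set = field(default_factory=set)
    up: int = 0
    down: int = 0
    status: str = "sampling"   # "sampling", "promoted", or "buried"

    def record_vote(self, user_id: str, positive: bool) -> bool:
        """Count one vote from the random-story box; return False if rejected."""
        if self.status != "sampling" or user_id in self.voters:
            return False       # voting closed, or this account already voted
        self.voters.add(user_id)
        if positive:
            self.up += 1
        else:
            self.down += 1
        self._update_status()
        return True

    def _update_status(self) -> None:
        total = self.up + self.down
        if total == SHUTOUT_AFTER and self.down / total >= SHUTOUT_NEGATIVE:
            self.status = "buried"     # shutout: stop collecting votes early
        elif total >= SAMPLE_SIZE:
            self.status = "promoted" if self.up / total > PROMOTE_AT else "buried"

def pick_story_for(user_id: str, pool: list, rng: random.Random):
    """Pick a random story still being sampled that this user has not voted on."""
    candidates = [s for s in pool
                  if s.status == "sampling" and user_id not in s.voters]
    return rng.choice(candidates) if candidates else None
```

The key design point is that `record_vote` is only ever called for stories handed out by `pick_story_for`; votes cast by going straight to a story's URL would never reach it.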
Then after running this system for a while, look through some collected data to determine if the system could be more efficient. For example, do you really need a sample of 100 votes every time? Suppose you determine that in 99% of cases, you get the same result just from tabulating the first 50 votes, as you would have gotten from tabulating all 100 votes. Then you could modify the system to collect only the first 50 votes, and then make a decision.
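That check could be run over logged votes along these lines. The sketch assumes each story's votes are stored in arrival order as a list of booleans and that a story is promoted when more than half its votes are positive; both are illustrative assumptions, as are the function names.

```python
def outcome(votes, cutoff=0.5):
    """True if a story with these boolean votes would be promoted."""
    return sum(votes) / len(votes) > cutoff

def early_agreement_rate(vote_logs, early=50):
    """Fraction of stories where the first `early` votes give the same
    promote/bury outcome as the full set of votes."""
    agree = sum(outcome(log[:early]) == outcome(log) for log in vote_logs)
    return agree / len(vote_logs)

# Three toy vote logs: two clear-cut stories and one that flips late.
logs = [
    [True] * 100,
    [False] * 100,
    [True] * 50 + [False] * 50,
]
print(early_agreement_rate(logs, early=50))
```

If the rate over real data came out near 99%, that would support cutting the sample to 50 votes.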
Suggestions for improvement? Flaws (hopefully not fatal)? Everyone who cares about keeping community sites like Digg free from abuse, and who wants to create a path for the best content to rise to the top, let's put our heads together and see what we can think of. The above is intended merely as a jumping-off point, and although I've worked it over and I can't see any specific points to improve efficiency, that's probably just because I've been looking at it too long. And if you Digg this story for me I'll give you 1,000 times as much cash as I gave my Mom last Mother's Day.
-
Anti-Spam Suits and Booby-Trapped Motions
Slashdot contributor Bennett Haselton writes in to say "The last few times that I sued a spammer in Washington Small Claims Court, I filed a "booby-trapped" written legal brief with the judge, about four pages long, with the second and third pages stuck together in the middle. I made these by poking through those two pages with a thumbtack, then running a tiny sliver of paper through the holes and gluing it to either page with white-out. The idea was that after the judge made their decision, I could go to the courthouse and look at the file to see if the judge read the brief or not, since if they turned the pages to read it, the tiny sliver of paper would break. To make a long story short, I tried this with 6 different judges, and in 3 out of 6 cases, the judge rejected the motion without reading it." The rest of this bizarre story follows. It's worth the read.
An example of a "booby-trapped" legal brief with the pages still joined together
I did this after it occurred to me one day that I'd never won a Small Claims case against a spammer or telemarketer where the defendant had shown up in court. Sometimes the judges said the spammers were not liable, sometimes they said that the subject line of the spam was not misleading enough, and sometimes they simply said that they were going to make an exception under the law ("It was just one phone call"). So I asked the handful of other people in Washington that I knew had sued spammers in Small Claims, and none of them had ever won a case against a spammer or telemarketer who appeared in court either. (The only Small Claims victories had been out-of-court settlements and default judgments where the defendant didn't show up.) It wasn't because most judges said that the cases couldn't validly be brought in Small Claims court, it was simply that the number of times the defendant appeared and the judge ruled against them, was zero. Now, there were only a handful of us suing spammers and telemarketers in Small Claims, and the defendant only rarely showed up, so we're talking about a sample size of dozens of cases, not hundreds, and I'm sure some of those were cases where reasonable people could disagree. But still. Zero?
I knew when I started suing spammers in 2001 that many judges would have attitudes similar to this guy:
Judge Nault: You know what I think about these cases?
Bennett Haselton: Uh... what?
Judge Nault: They stink.
Bennett Haselton: Really? Why?
Judge Nault: I don't have to answer your questions, you have to answer mine.
Bennett Haselton: OK.
[...]
Judge Nault: I just think this is the stupidest law in the world. But I didn't write the law and I'm bound to follow it. So I'm gonna go ahead and give you your money. But I'm just saying, it just takes up court time and it's absolutely stupid.
Actually, I like honesty, and Judge Nault is like the hot chick who just tells you that she doesn't like your looks instead of making up some crap about your personality. But after getting similar (but usually more subtle) messages from so many different judges, I thought it was worthwhile to test whether the motions I was filing were being read at all. The 6 test case motions were all filed as part of the formal cases, so the judges were at least theoretically required to read them -- and each one was about facts unique to that case (that is, I wasn't handing in a copy of something that I had already handed in a million times before, that wasn't why they were being ignored). I posted the complete list of all the test cases here.
I realize, of course, that courts are overburdened and judges have to prioritize what they work on. The problem I have with that excuse applied to these cases, is that often the judge spent so much time haranguing me for filing some "silly" lawsuit, that they could have read the brief forwards and backwards in the same amount of time. More likely, most judges probably just don't think spam is a real problem worth spending time on. (Obligatory rebuttal.) But, strictly speaking, that's not the judge's decision. If the legislature has passed a law making spam punishable, the judges are simply supposed to apply that law, not to be influenced by their opinion about the law. (If a judge asserts a bias in the other direction, that's just as inappropriate, but that has been very rare.)
Well, shoot, I can't complain
If you feel you've been wronged, there is a Commission on Judicial Conduct in Washington for processing complaints against judges for improper behavior. For example, when a certain Judge Gary W. Velie got in trouble for saying "nuke the sand niggers" (referring to the first Iraq war), and for saying in court that a defendant had "gone crazy from sucking too many cocks" and telling another lawyer in court that he looked like he had been "jacking off a bobcat in a phone booth", the Commission flew (by judicial standards, meaning, a little over a year later) into action, and issued a reprimand. Evidently this was an exceptional situation, since the CJC takes action in response to only about 3% of submitted complaints in a typical year. Apparently the last time the CJC actually barred someone from office was in 2005, in the case of a judge who was convicted and imprisoned for molesting an 11-year-old boy. The Commission lists this decision as one of their accomplishments, although I think the judge probably wouldn't have been re-elected after that anyway.
Of the three test-case judges who got caught with the booby-trapped motions, two of them I thought were not really worse than most other judges anyway, but for the third one, I thought filing a complaint was probably justified. This was a case where I had telephoned the spammer before the trial, pretending to be an interested customer, and tape-recorded him making such statements as "Well, I would blast out 5 million for $500" and "It's a United-States-based company but they pump everything through China and then it comes back to the United States". At the trial, presided over by Judge Karlie Jorgensen, the spammer didn't know I was the guy from the phone call, so he claimed that he didn't even know how to send spam and had no idea what I was talking about, while Jorgensen kept Judge-Judying me in between just about every other sentence for picking on this obviously innocent man. After I brought out the recording, she became very flustered for a few moments and then started accusing me of "entrapment". (Entrapment, of course, is where you trick someone into doing something, and then sue them or arrest them for it. That wasn't the case here, since he spammed me first, and I called him afterwards just to get evidence that he was in the spamming business.) In the end she dismissed the case, and never said anything about the statements the spammer had made under oath.
So, that's when I filed my "motion to reconsider" with the pages stuck together, and after I got a letter that it had been denied (no kidding), I went to the courthouse and found the pages still attached. After the rest of the experiment was finished, I filed an official complaint with the Commission on Judicial Conduct saying that my motion had been rejected with the pages still stuck together, indicating the judge didn't read it. A little over a year later, I got a letter saying the complaint had been rejected.
Making a federal case out of it
Fortunately, there is a way to bring future spam suits in federal court, where several lawyers have suggested to me that I'm likely to get better results (with their help, naturally).
First though, I am of course aware that most spam can't be traced to the original sender to sue them, and that a lot of spam is sent by some Russian hacker or some loser in his Mom's basement who wouldn't be able to pay off a court judgment anyway. However, quite a bit of spam can be traced indirectly to companies that paid the spammer to send the spam or paid them for the leads that they generated, and those companies are usually easier to find and easier to collect against. For a while, every time I got a mortgage spam with a link to fill out a contact form, I would fill it out using a temporary phone number in a certain area code. Then I'd see which mortgage companies called me, and I'd call them back saying, "The person who sold you this lead generated them illegally; you should stop buying leads from them, and should stop buying leads from people without asking where they came from." Then I'd wait until the next similar mortgage spam came in, fill out the form with a new phone number in the same area code, see which mortgage companies called me, and repeat.
Sometimes the mortgage brokers apologized and said they'd stop dealing with the person who sold them the lead. Others were unrepentant and started hanging up on me by the second or third time that I called them to tell them their latest batch of leads was generated by a spammer.
The Washington law lets you sue anyone who "sends, or conspires with another to send" spam if the person "knows, or consciously avoids knowing" that the spam violates the law. If I do file any future spam suits, what I'll probably do is use this method to find mortgage companies that refuse to stop buying leads from spammers, and then sue them for the cumulative liability for all the spam that I got from their lead generators. There are several advantages to doing it this way:
- Unethical mortgage companies are easier to locate, sue, and collect against, than most spammers.
- Rather than waiting for that rare spam that contains enough information to find and sue the spammer, you can almost always trace a mortgage spam to the company that is buying the leads, by filling it in with "bait" contact information.
- If you reach more than $75,000 worth of liability, you can sue in federal court. At least one good lawyer has said that if I built a case in this way against a spam-enabling mortgage company, he'd help file it for no up-front fee in exchange for a percentage of the winnings.
This last advantage is the big one. Whatever most media figures say in their rants against judges, what they usually don't mention is that there's a dividing line between judges at the state and federal levels: to be a federal judge, someone has to put their reputation on the line and nominate you. It's a horribly politicized process, but at least it's something. At the state level on the other hand, any lawyer who wants to be a judge can run for office -- and even then, for most judicial positions there is only one candidate. If we're so cynical about lawyers and politicians, why on Earth do we give a pass to judges, when a state-level judge is just a lawyer who ran for office? In fact, to be a "pro tem" judge, filling in for a day for the regular judge, you don't even have to win an election, you just take a class and then sign up for an available time slot.
Given the vastly greater seriousness of becoming a federal judge, I'll bet that if one of them had been handling the Karlie Jorgensen case, and the spammer said he "knew nothing about any spam" right before being confronted with a tape of his past conversations, maybe the judge wouldn't have sent him to jail for perjury, but the judge probably would have mentioned something about it. And if you had proof that a federal judge denied a motion without reading it, some cynics might not be surprised, but an official complaint at that level would probably be taken more seriously.
Besides, the nice thing about federal cases is that the defendant is likely to have a lawyer who will talk some sense into them and get them to settle out of court, instead of digging in their heels the way spammers often do in Small Claims. They say the best lawyer isn't the one who wins in court but the one who keeps the case from going before the judge at all, and I'm sure that's true even with federal judges. By that standard, I hope that every spammer that I sue in federal court, has a fantastic lawyer.
-
Censorware Not Good, Just Better Than COPA
Slashdot contributor Bennett Haselton writes in with an essay that starts "On March 22nd, District Court Judge Lowell Reed ruled that the Child Online Protection Act was unconstitutional, partly because the judge called it 'vague and overbroad,' and partly because less restrictive means existed, such as Internet blocking software. I'll leave others to comment on the legal issues, but blocking software is something that I've studied, and it's important to make sure this decision is not seen as some kind of vindication for the 'censorware' industry." Tap that link below to read the rest of his story.
The thrust of the judge's findings about blocking software was that it blocks a high proportion of pornography, blocks a low proportion of non-pornographic Web sites, and that it is difficult for most kids to get around. I think that these conclusions are correct for the purpose of the decision he was making -- in other words, blocking software blocks a high proportion of pornography compared to the law in question, and is difficult to get around compared to the law in question. But let's not get carried away -- blocking software is not that accurate, and not that hard to defeat.
Consider first the accuracy rates cited by the judge. Citing expert witness reports, he wrote, "I find that filters generally block about 95% of sexually explicit material", and then quoted several different rates for overblocking provided by expert witness reports, ranging from about 4% to 11%. I wrote earlier about the different ways to interpret overblocking error rates -- the gist was that if you care about the constitutional issues with filter use, then you look at the percentage of blocked sites that are non-pornographic (i.e. for every porn site that gets blocked, how many research sites get canned along with it), and that number tends to be high. On the other hand, if you simply care about the effectiveness of blocking software in a home setting where there is no constitutional issue raised, then you look at the percentage of non-pornographic sites that are blocked, and that number tends to be low.
For example, suppose for the sake of argument that 1% of Web sites in a given sample are sexually explicit, or 100 Web sites out of 10,000. To use Judge Reed's numbers, suppose that 95% of those porn sites, or exactly 95 in this sample, are blocked, whereas of the other 9,900 sites, 5%, or exactly 495 of them, are blocked by mistake. Then the percentage of non-porn sites that are blocked is only 5%, but the percentage of blocked sites that are non-porn is actually about 84% (495 blocked non-porn sites, out of a total of 495+95=590 blocked sites). One of our past studies of blocking software did indeed sometimes find error rates of about 80%, due to errors caused by IP address blocking and filters being tripped up by keywords (even when "keyword blocking" features were supposedly turned off -- because in that case the program still blocked sites on its master blacklist, and those blacklists are frequently built by scanning the Web for keywords).
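The two ways of reading the error rate can be reproduced with a short calculation. The function below is just a sketch with made-up names; the inputs are the hypothetical figures from the example (10,000 sites, 1% explicit, 95% of those blocked, 5% of the rest blocked by mistake).

```python
def blocking_rates(total_sites, porn_fraction, porn_block_rate, overblock_rate):
    """Return (blocked porn sites, blocked non-porn sites,
    share of all blocked sites that are non-porn)."""
    porn = total_sites * porn_fraction
    non_porn = total_sites - porn
    blocked_porn = porn * porn_block_rate
    blocked_non_porn = non_porn * overblock_rate
    share_non_porn = blocked_non_porn / (blocked_porn + blocked_non_porn)
    return blocked_porn, blocked_non_porn, share_non_porn

blocked_porn, blocked_non_porn, share = blocking_rates(10_000, 0.01, 0.95, 0.05)
print(round(blocked_porn), round(blocked_non_porn))  # 95 and 495 sites
print(round(share * 100, 1))                         # a little under 84 percent
```

The same 5% overblocking rate looks small measured against all non-porn sites and large measured against the blocked list, because non-porn sites vastly outnumber porn sites in the sample.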
Another portion of the judge's ruling dealt with the difficulty of getting around blocking software:
Filtering companies actively take steps to make sure that children are not able to come up with ways to circumvent their filters. Filtering companies monitor the Web to identify any methods for circumventing filters, and when such methods are found, the filtering companies respond by putting in extra protections in an attempt to make sure that those methods do not succeed with their products... It is difficult for children to circumvent filters because of the technical ability and expertise necessary to do so by disabling the product on the actual computer or by accessing the Web through a proxy or intermediary computer and successfully avoiding a filter on the minor's computer... Accessing the Web through a proxy or intermediary computer will not enable a minor to avoid a filtering product that analyzes the content of the Web page requested, in addition to where the page is coming from. Any product that contains a real-time, dynamic filtering component cannot be avoided by use of a proxy, whether the filter is located on the network or on the user's computer.
After the ruling came out, I tried some of the best-known blocking software programs to see how easily they could be defeated: Net Nanny, SurfControl, CyberSitter, and AOL Parental Controls. Net Nanny and SurfControl apparently could not block https:// sites at all, so I was able to get to https://www.StupidCensorship.com/ and access anything I wanted from there, despite the fact that that site had been public for over a year. Apparently I do have the "technical ability and expertise necessary" to "access the Web through a proxy", but then again I'm not a minor, so, kids, don't hurt yourself trying that.
CyberSitter did intercept the https:// request so it did block StupidCensorship.com, but it didn't know about some of the other proxy sites that we had mailed out to our users recently. One of those did however get blocked because the word "hacking" appeared on the page -- as in,
This site is a tool for circumventing Internet censorship to promote free speech. It does not enable any hacking, cracking or any illegal activities (since it doesn't let you to access any sites that you couldn't access from home anyway).
so it's probably safe to say that if the CyberSitter filter is that paranoid, it would result in a good deal of overblocking as well. AOL Parental Controls also did not block the latest proxies, although it wouldn't let me load sites like Playboy through the proxy, presumably because it recognized the contents of the page and blocked it (so on that point, Judge Reed was right).
But none of the products could stop the doomsday weapon, which is to burn an Ubuntu Linux CD and boot from that, bypassing any security software installed under Windows. I can see your eyes glazing over at the thought of kids attempting to do that, but it's merely an unfamiliar process to most people, not actually difficult. (I've been saying for years, that with the greater difficulty of using Linux over Windows, there's nothing cool or clever about running it just for its own sake so you can feel badass, and the only time you need it is if you want to do something that only Linux lets you do. Well, here's something!)
But in spite of everything, I think the judge's conclusions about blocking software were still broadly correct, because he was comparing the merits of blocking software against the merits of a law that would have prohibited commercial pornography from being published on the Web in the United States. In talking about the "effectiveness" of such a law, the judge and lawyers cited the fact that as many as 75% of adult sites were hosted overseas anyway. But even that high number understates the situation, because hypothetically if all the porn on the Web in the U.S. did get outlawed, it would be easy for anyone to spend all their time looking at porn from outside the country. When you're talking about a supply of content that is so large that nobody could finish looking at it all if they spent the rest of their life trying, it doesn't really matter if 25% or 50% or 75% is located within your legal jurisdiction. I never stop hoping that a judge will say, "Look, pictures of naked people don't hurt anyone, no, not even people under 18. Shoot, when I was 13 and president of Future Lawyers of America, my friend gave me a copy of Playboy as a down payment for my unsuccessful attempts to defend him on curfew-breaking charges in Foot v. Ass, and look how I turned out." But even a judge who firmly believed that people under 18 were harmed by pornographic images, would have found little reason to uphold this law.
-
Yes Virginia, ISPs Have Silently Blocked Web Sites
Slashdot contributor Bennett Haselton writes "A recurring theme in editorials about Net Neutrality -- broadly defined as the principle that ISPs may not block or degrade access to sites based on their content or ownership (with exceptions for clearly delineated services like parental controls) -- is that it is a "solution in search of a problem", that ISPs in the free world have never actually blocked legal content on purpose. True, the movement is mostly motivated by statements by some ISPs about what they might do in the future, such as slow down customers' access to sites if the sites haven't paid a fast-lane "toll". But there was also an oft-forgotten episode in 2000 when it was revealed that two backbone providers, AboveNet and TeleGlobe, had been blocking users' access to certain Web sites for over a year -- not due to a configuration error, but by the choice of management within those companies. Maybe I'm biased, since one of the Web sites being blocked was mine. But I think this incident is more relevant than ever now -- not just because it shows that prolonged violations of Net Neutrality can happen, but because some of the people who organized or supported AboveNet's Web filtering, are people in fairly influential positions today, including the head of the Internet Systems Consortium, the head of the IRTF's Anti-Spam Research Group, and the operator of Spamhaus. Which begs the question: If they really believe that backbone companies have the right to silently block Web sites, are some of them headed for a rift with Net Neutrality supporters?" Read on for the rest of his story.
In the aforementioned instance, AboveNet and TeleGlobe were not selling "parental filters" or other common types of filtered Internet access; the users being blocked from our Web sites were adults paying for what they thought were unfiltered Internet connections.
What had happened was that AboveNet and TeleGlobe signed up to block Web sites on the Realtime Blackhole List, a list which was widely (but inaccurately) thought to be a list of "spammers", put out by a group called the Mail Abuse Prevention System. (MAPS and the RBL still exist, but under new management and in a form that bears little resemblance to their late-90's forerunners.) Most ISPs that used the RBL used it to filter only incoming e-mail, but AboveNet went all-out and blocked users from even viewing RBL'ed web sites, presumably because two of MAPS's founders, Paul Vixie and Dave Rand, were on the AboveNet board of directors. And it turned out that the RBL not only included spammers, but also Web sites that were not sending mail at all but were blocked because of their content -- in our case, our ISP got blocked because some other customers were selling mailing list software that MAPS believed could be too easily abused by spammers.
These two distinctions -- (1) the distinction between blocking incoming e-mail from spammers, versus blocking Web sites; and (2) the distinction between blocking traffic due to spam activity, versus blocking sites because of their content -- both go to the heart of what Net Neutrality is, and isn't, about. Net Neutrality is about user preferences -- not meaning that as a buzzword, but as an actual guiding principle to figure out what is and is not covered by the cause. If an ISP filters incoming mail from known spammers, that generally improves the user experience, and is something many users would expect an ISP to do anyway. But if an ISP blocks users from reaching Web sites (even, for the sake of argument, the Web sites of actual spammers), then that's generally counteracting the user's wishes -- if the user didn't want to go there, they wouldn't have typed it in. (After all, I visit spammers' Web sites all the time, usually right before I sue them.) Similarly, if an ISP blocks traffic from sites because of spam or other network abuse, that serves to protect their own users. But if an ISP blocks users from viewing sites because of their content, that's generally not expected by users, unless they've specifically signed up for something like parental controls. The Snowe Net Neutrality amendment proposed last year recognized both of these distinctions, and stated that nothing in the amendment would be interpreted to prohibit spam filtering, parental control services, or measures to protect network security.
The MAPS incident thus shaped most of my opinions about Net Neutrality 6 years before the debate even had a name. When I first found out in August 2000 that our ISP was blacklisted, like most people I believed that the RBL really was a list of spammers; after all the MAPS web page said that the RBL was a list of networks that "originate or relay spam". So I called my ISP screaming at them for being incompetent spam-enablers (the culmination of many frustrating issues with them), and saying that if they really were letting customers send spam, or running an insecure server that spammers were hijacking, I would leave on principle, if the cretins managing our server didn't drop it in the lake first. The ISP owner then told me what happened: that the ISP was not blacklisted for spamming customers, but because of the content of the other sites. (Buried in the list of RBL criteria on MAPS's site was the statement that sites could be blacklisted for providing "spam software", although the criteria did not define how they distinguished between spam software and regular mailing list software, which is how our ISP got caught in the net. And the criteria did not disclose anywhere the most controversial feature of the RBL, which is that if an ISP didn't comply, MAPS would start blacklisting other unrelated sites at the same ISP to put more pressure on them.) I agreed that this seemed to be absurd, and said I wouldn't leave the ISP if they were being blackballed just because of the content of hosted pages.
I don't know exactly what the mail software in question did or where MAPS thought the line should be drawn, but I am a purist about content -- it's a long-standing principle among the Internet security community that if a tool exists which exploits a security hole, you don't try to make the software disappear, you fix the hole. And besides, since MAPS and their supporters wanted to blackball ISPs that hosted spamming software (however you defined that), but the same people had never advocated blackballing ISPs that hosted network break-in tools and other cracking programs, for example, then what were they really saying? That spamming someone is more unethical than breaking into their network?
But by far the most common objection to my complaint about AboveNet blocking Web sites was, "Hey, if a private company blocks things, as long as they're being honest to their users about it, who cares?" Well, true, but the fact that AboveNet blocked Web sites was not widely known even within the company; when I once called AboveNet feigning ignorance and asking them if they blocked RBL'ed Web sites, the technician who spoke to me said, "No, that wouldn't make any sense." (Well, half right.) Their AUP mentioned "protecting users from spam" but said nothing about blocking Web sites. In fact, other than "family-filtered" ISPs and similar services, I've never heard of any company blocking Web sites that actually did try to make their users aware of it. (On the other hand, even if AboveNet had fully disclosed their filtering, they were still a backbone company selling connectivity mainly to ISPs -- and I think if you sell something wholesale that can only be re-sold to the public by fraudulent means, then you're at least partly complicit in that fraud as well.)
If you're tempted to argue that backbone providers should be allowed to block whatever they want as long as they bury it in their AUP (although AboveNet and TeleGlobe didn't even do that much), just consider: When you access Google from your home computer, have you read the AUP of every network that the packets pass through, to check whether they reserve the right to block or even modify your traffic? Without doing a traceroute, could you even name all the networks that the traffic passes through? Do you really want the burden to be on you to check with all of them every time there's a problem reaching a Web site? Or do you feel like there's an understanding that as long as you pay your bill, they should let you go wherever you want?
Some have argued that if an ISP blocks the user from reaching a Web site, then even if the ISP is defrauding the user, that's still strictly an issue between the user and the ISP. But if a user is trying to reach your Web site, the user is trying to give you something of value: their attention, their eyeballs on your advertisements, sometimes even their money (with the expectation that you will provide them with something in return, of course, like some content worth reading). If the ISP steps in and blocks that, then the ISP has taken something of value that the user was attempting to give to you, and diverted it to serve their own interests. To me that doesn't seem ethically much different from the FedEx driver swiping the chocolates that someone tried to send you for Valentine's Day. Is that just between the sender and FedEx? Or do you have a beef because you didn't get the present that was intended for you, and you had to eat last week's chocolates to cheer up?
The modern-day threats to Net Neutrality are different: slowing access to Web sites unless the site owners pay a "toll", instead of blocking access to sites because of the content of other sites hosted at the same ISP. But they both boil down to the same thing: not giving end users what they have already paid for. If a user buys Internet access, they almost always buy it with the understanding that if they access a site, the content will download as quickly as their connection allows.
Thus the most common misconception about Net Neutrality is that the proponents are fighting against "capitalism" -- ISPs just charging more for different delivery speeds. But ISPs are already charging users for those delivery lines -- including different tiers for different prices. That's capitalism, and it works, with prices falling all the time in a fairly competitive market. But charging publishers for those higher delivery speeds to the user's house, is really more like double-billing, because the user has already been charged once for the lines that the content is coming over, so the ISP is trying to charge the content publisher again for the same service. Of course, if you charge party A for doing X, and then you try to charge party B for the same instance of doing X, and party B doesn't pay up so you don't do X, you're also breaking your deal with A. Brad Templeton of the EFF stated as much on his blog in 2006:
The pipes start off belonging to the ISPs but they sell them to their customers. The customers are buying their line to the middle, where they meet the line from the other user or site they want to talk to. The problem is generated because the carriers all price the lines at lower than they might have to charge if they were all fully saturated, since most users only make limited, partial use of the lines. When new apps increase the amount a typical user needs, it alters the economics of the ISP. They could deal with that by raising prices and really delivering the service they only pretend to sell, or by charging the other end, and breaking the cost contract. They've rattled sabres about doing the latter.
And I think the same is clearly true if, instead of trying to extract money from the content publisher, the ISP tries to extract something else, like an agreement to shut down certain Web sites before the ISP will let their users view other sites hosted at the same company. You can talk all day about how evil those Web sites are, but the ISP has already sold the user a connection with the implied ability to access them.
Anyway, this all came out in 2000 when a Slashdot article revealed that AboveNet had been blocking Web sites, and AboveNet stopped doing it two hours after the article came out. (TeleGlobe stuck with it for a few more months.) But from the hostility of the reaction, you'd think that we had published cartoons in a Danish newspaper showing Paul Vixie with a bomb in his turban. I got more e-mails than I could count arguing that AboveNet had the right to block whatever Web sites they felt like, regardless of whether the end users knew it was happening. To those people, I'd be sincerely interested in their answer to this question: Does that mean they'd have no problem if they found out their ISP was silently blocking sites for political reasons? There is a clear line between following user preferences by blocking spam, and countermanding user preferences by blocking sites because of their content -- and once you've crossed that line, where's the logical stopping point? Seriously, I would have liked to have known how they would answer that, if I could have gotten any meaningful dialog going with them, which most of the time I couldn't. At the time, I'd just spent four years telling people that kids looking at porn was a non-issue, and that by the way if their kids came to my Web site I'd even help them get around their blocking software, and I still got more angry e-mails for disclosing the fact that AboveNet blocked Web sites based on their content, than I'd gotten in all the previous four years combined.
(A few even accused us of moving into a blacklisted address block on purpose. This was because the actual move happened after the blacklisting was in place, even though I told them all that our ISP had announced the coming move two months before -- repeat, before -- they ever heard from MAPS. Some people were so in love with that "smoking gun" that they didn't believe me; that's their prerogative. But don't take my word for it -- when one supporter wrote to MAPS to ask about un-blocking our site, MAPS officer Kelly Thompson replied:
>Would it be possible to
>selectively unblock peacefire.org (209.211.253.169)?
Technically? Yes, it is. It's a violation of our policy, though, so I can't do so.
I would be willing to help you find other free or reduced cost hosting, however.
It was MAPS's decision, not ours or our ISP's, to have our site blocked. That should settle that once and for all, just as soon as there is peace in the Middle East and a black lesbian in the White House.)
But what do all these people think about Net Neutrality, 6 years later? I tried to track down the influential people who had spoken out supporting AboveNet's blocking of Web sites, or at least their right to block Web sites. My position was, we can agree to disagree on that, but if they really feel that way, why haven't they been speaking out against Net Neutrality? The proposed Snowe amendment was pretty clear:
SEC. 12. INTERNET NEUTRALITY
(a) Duty of Broadband Service Providers- With respect to any broadband service offered to the public, each broadband service provider shall--
(1) not block, interfere with, discriminate against, impair, or degrade the ability of any person to use a broadband service to access, use, send, post, receive, or offer any lawful content, application, or service made available via the Internet.
John Levine, webmaster of Abuse.Net, head of the IRTF's Anti-Spam Research Group, and one of the most vocal critics of Peacefire's campaign against AboveNet's Web filtering, said that he would have opposed the bill but didn't bother because it didn't have much chance of passing. Well, it didn't, but the bill was significant not because of its likelihood of passage, but because it articulated the principles that the Net Neutrality coalition had rallied around, and with the momentum behind the movement, it's likely to achieve at least some of its goals, by legislation or otherwise.
Paul Vixie, Dave Rand, and Steve Linford did not respond to requests for comment on Net Neutrality. But Paul Vixie wrote something very interesting in a May 2006 blog post:
Second, there's network neutrality. In telephone service, the government mandates that all companies providing voice-grade telephony interconnect with eachother at preset rates, thus ensuring that any phone can call any other phone and that new phone companies can enter the field to help ensure competition. In Internet service, the government mandates nothing. Recently SBC (I mean AT&T, I think, is it Wednesday?) rattled its sabre and said that Google and other content supplying companies should be paying for the use of SBC's backbone to reach SBC's eyeballs. Most of us said, uh, what? "Aren't SBC's own customers paying SBC to carry that traffic?" Some of us even said "I am not an eyeball, I am a person!" But anyway, from time to time these Internet companies shut down interconnects in hopes of creating new cash flows among eachother, and until the government regulates this, we're all at risk of higher prices or lower service with zero notice. Some well meaning democrats are trying to challenge this with "network neutrality" legislation, but this probably isn't their year. Or their decade.
San Francisco has a government, though. And if San Francisco owned and operated its own wireless Internet plant, we could mandate that any Internet company wishing to do business in this city interconnect at fair and reasonable cost to all other Internet companies wishing to do business in this city.
"Until the government regulates this"? "Government mandates"? "Fair and reasonable cost"? Quick, call the anti-socialist intervention squad! How long does it take those San Francisco hippies to suck the new arrivals' brains out anyway? Of course, I agree with everything he said. It's just that if you replace "create new cash flows" with "try to get ISPs to remove content from their servers", this describes exactly what Vixie and AboveNet were doing a few years earlier. He's a smart guy, and I'm sure this didn't escape his sense of irony, so perhaps this confirms something I'd suspected all along, which is that Vixie understood the subtleties of the issue better than most of his cheerleaders, and may be having second thoughts about AboveNet's Web-blocking misadventure. From the beginning, in a 1997 interview with Sun World, he sounded like someone trying to at least keep an open mind:
Concentration of power into a single individual: It's very true that power has corrupted every individual in whom it has ever been concentrated in the history of mankind. I do not feel that I am necessarily above whatever elements of human nature give rise to that. I worry about it. Probably other people worry about it more than I do.
He made no such frank statements, however, during the controversy over AboveNet's Web site blocking. (Perhaps MAPS's lawyers were worried that he was a little too unfiltered and advised him not to comment; at the time, the MAPS Web site had a "How to sue MAPS" link on the front page.)
Speaking of which, Anne Mitchell, Director of Legal and Public Affairs for MAPS during the time when AboveNet was blocking Web sites, was the only MAPS adherent from the era that I could find who has since clearly and publicly come out against Net Neutrality. In May 2006 she wrote:
Here's the thing that the 3Ns (Net Neutrality Nuts) don't get: bandwidth costs money. And if you can't charge those who use the majority of it accordingly, then you are going to have to amortize it across everybody.
So, if a net neutrality law passes, don't be surprised when your costs to have an Internet account skyrocket.
Because somebody has to pay those bills, and if the law says that the ISPs can't charge the big guys - the big users - differently, it means that they have to charge them the same rate that they charge everyone else. And that means not that their rate will go down, but that everybody else's rate will go up.
And then again in February 2007 in another blog post titled "Towards A Nanny Internet", she wrote, "Network neutrality is the idea that ISPs should be forced to charge everybody the same for their Internet use", grouping it together with proposed anti-bullying and anti-anonymity laws.
Well, points to Anne for being consistent, and for publicly declaring her views in no uncertain terms, which is all I'm asking of the other supporters of AboveNet's website blocking policy. (Although she's coming at it from a different angle this time, "How do we work out who pays for the traffic" rather than "ISPs should be allowed to block whatever they want without telling anybody".) But this is also a textbook example of what I think are the three major fallacies of opposition to Net Neutrality:
First, lumping it together with other examples of unpopular regulation and calling it one more example of Big Government -- an argument also tried in other editorials ("Politicians and public figures alike should realize the absurdity of advocating more red tape to keep the Internet free"). This meme has never really caught on, possibly because groups like the ACLU and the EFF that have traditionally opposed true Internet censorship, have lined up in favor of Net Neutrality. All the proposed "red tape" and "regulation" really says is that if a user attempts to access a Web site over a connection that they've paid for, the ISP may not block or slow down their access, a law which most people would hardly consider tyrannical.
Second, asserting that "Network neutrality is the idea that ISPs should be forced to charge everybody the same for their Internet use." I've never actually heard anyone advocate anything close to that, but a common question among skeptics is why different "tiers" for Internet traffic are really any different from different-tiered pricing for dial-up vs. DSL, or for different levels of Web hosting. The difference is that when users and Web site owners pay for those connections, they are paying for their respective connections to the rest of the Internet. But an ISP charging a Web site owner to carry their traffic the last mile to the user's house, is not charging for a product or service, but really charging a fee not to break a service that they've already agreed to provide to the user.
Which leads to the third misconception: "Here's the thing that the 3Ns (Net Neutrality Nuts) don't get: bandwidth costs money... So, if a net neutrality law passes, don't be surprised when your costs to have an Internet account skyrocket." But it's not about how much a service costs, but about the ethics of double-billing for it. We know that ISP pricing models can already support the total traffic that people consume today, and ISPs do already follow net neutrality principles most of the time, so nobody's costs will "skyrocket" just because a neutrality law passes. If vastly more people start trying to stream CNN over the Internet 24/7, and fully using the services that ISPs have "only been pretending to sell" as Brad Templeton put it, then ISPs may have to charge more for users who consume too much bandwidth, encouraging people to stay at today's average levels by rationing themselves and perhaps watching 24 on their $5,000 TV sets sometimes instead of downloading it off of BitTorrent to their laptop every week because it makes them feel like a haX0r. Much as we all love our unmetered connections, it wouldn't be a violation of Net Neutrality for ISPs to charge users for bandwidth hogging, to keep everyone from going too far above today's levels. What ISPs should not do is charge users for implied full-throttle connections, and then turn around to charge publishers for moving bits over those same lines, or block the connection for any other reason.
So, yes, Virginia, blocking of Web sites does happen -- and by "Virginia", I mean FTC Chairman Deborah Platt Majoras, who said in a speech in August 2006: "I have to say, thus far, proponents of net neutrality regulation have not come to us to explain where the market is failing or what anticompetitive conduct we should challenge; we are open to hearing from them." This was echoed in an editorial later that month from Sonia Arrison of the Pacific Research Institute:
Internet service providers have voluntarily upheld content-neutral practices without the need for government intervention, and consumers would never stand for blocked Web sites... If the loss of net neutrality principles was really a problem, advocates wouldn't need to scare Americans in order to win their support. Using government regulation preemptively to shortchange business partners is a reckless abuse of the public policy process. New laws should be based on facts and reality, not fear and hypothetical situations.
I guess both of those ladies' ISPs must be blocking access to the SaveTheInternet.com Web site, so I e-mailed both of them the coalition's list of examples, and added a note about the AboveNet/TeleGlobe incident as well. No personal response from either of them yet, but I'm sure they just got lost in the shuffle while they were so busy sending out corrections. (On the other hand, I did get a courteous response from Randolph J. May of the Free State Foundation, when I wrote to him about an editorial he penned which also argued that violations have not happened: "It is generally agreed that except for a few isolated and quickly remedied incidents, neither the cable operators nor the telephone companies providing broadband Internet services have blocked, impaired or otherwise restricted subscriber access to the content of unaffiliated entities." He said he hadn't known about the AboveNet/TeleGlobe incident either.)
Another theme in some anti-Net-Neutrality editorials is that existing laws are enough to deal with the problem. In Majoras's speech, she said, "We should not forget that we already have in place an existing law enforcement and regulatory structure." Arrison echoed that "Numerous federal agencies already have set a basic legal framework in place to preserve fair competition and business practices on the Internet". Well, as Yogi Berra says, in theory, there is no difference between theory and practice, but in practice, there is. After I found out AboveNet and TeleGlobe were blocking my Web site, I called about twenty lawyers in the Bellevue phone book, figuring: I wasn't greedy, but surely there would be financial damages for deceiving users and blocking our site, enough to pay a lawyer in return for handling the case?
I think about two lawyers called me back, and they both said that even though what the backbone companies were doing clearly looked like fraud, it would take tens of thousands of dollars just to get started, and even if we ever got to court, the judge could call it however they wanted. Whatever laws exist now, they may help the slightly smaller big guy against the bigger big guy, but are not much use to the little or medium-sized guy.
So, any informed debate about Net Neutrality has to include the fact that, yes, some providers have blocked Web sites on purpose, for long periods of time, and no, the free market didn't fix it by itself. Even if something on that scale never happens again, if the free market and the anti-trust laws didn't automatically correct a case where Web sites were being blocked outright, then it's wishful thinking to think that those forces will prevent ISPs from merely slowing down Web access to sites that haven't paid a "toll", as they have made noises about doing. One AboveNet customer, Sam Knutson, said when he found out about the Web site blocking, "This type of behavior on the part of an ISP is reprehensible. I pay for a pipe and don't expect this type of monkey business." Well, I agree that it's reprehensible; whether we should "expect" more of it or not, depends on how much the Net Neutrality movement achieves its goals.
-
Yes Virginia, ISPs Have Silently Blocked Web Sites
Slashdot contributor Bennett Haselton writes "A recurring theme in editorials about Net Neutrality -- broadly defined as the principle that ISPs may not block or degrade access to sites based on their content or ownership (with exceptions for clearly delineated services like parental controls) -- is that it is a "solution in search of a problem", that ISPs in the free world have never actually blocked legal content on purpose. True, the movement is mostly motivated by statements by some ISPs about what they might do in the future, such as slow down customers' access to sites if the sites haven't paid a fast-lane "toll". But there was also an oft-forgotten episode in 2000 when it was revealed that two backbone providers, AboveNet and TeleGlobe, had been blocking users' access to certain Web sites for over a year -- not due to a configuration error, but by the choice of management within those companies. Maybe I'm biased, since one of the Web sites being blocked was mine. But I think this incident is more relevant than ever now -- not just because it shows that prolonged violations of Net Neutrality can happen, but because some of the people who organized or supported AboveNet's Web filtering, are people in fairly influential positions today, including the head of the Internet Systems Consortium, the head of the IRTF's Anti-Spam Research Group, and the operator of Spamhaus. Which begs the question: If they really believe that backbone companies have the right to silently block Web sites, are some of them headed for a rift with Net Neutrality supporters?" Read on for the rest of his story.
In the aforementioned instance, AboveNet and TeleGlobe were not selling "parental filters" or other common types of filtered Internet access; the users being blocked from our Web sites were adults paying for what they thought were unfiltered Internet connections.
What had happened was that AboveNet and TeleGlobe signed up to block Web sites on the Realtime Blackhole List, a list which was widely (but inaccurately) thought to be a list of "spammers", put out by a group called the Mail Abuse Prevention System. (MAPS and the RBL still exist, but under new management and in a form that bears little resemblance to their late-90's forerunners.) Most ISPs that used the RBL used it to filter only incoming e-mail, but AboveNet went all-out and blocked users from even viewing RBL'ed web sites, presumably because two of MAPS's founders, Paul Vixie and Dave Rand, were on the AboveNet board of directors. And it turned out that the RBL not only included spammers, but also Web sites that were not sending mail at all but were blocked because of their content -- in our case, our ISP got blocked because some other customers were selling mailing list software that MAPS believed could be too easily abused by spammers.
These two distinctions -- (1) the distinction between blocking incoming e-mail from spammers, versus blocking Web sites; and (2) the distinction between blocking traffic due to spam activity, versus blocking sites because of their content -- both go to the heart of what Net Neutrality is, and isn't, about. Net Neutrality is about user preferences -- not meaning that as a buzzword, but as an actual guiding principle to figure out what is and is not covered by the cause. If an ISP filters incoming mail from known spammers, that generally improves the user experience, and is something many users would expect an ISP to do anyway. But if an ISP blocks users from reaching Web sites (even, for the sake of argument, the Web sites of actual spammers), then that's generally counteracting the user's wishes -- if the user didn't want to go there, they wouldn't have typed it in. (After all, I visit spammers' Web sites all the time, usually right before I sue them.) Similarly, if an ISP blocks traffic from sites because of spam or other network abuse, that serves to protect their own users. But if an ISP blocks users from viewing sites because of their content, that's generally not expected by users, unless they've specifically signed up for something like parental controls. The Snowe Net Neutrality amendment proposed last year recognized both of these distinctions, and stated that nothing in the amendment would be interpreted to prohibit spam filtering, parental control services, or measures to protect network security.
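For readers who think in code, the two distinctions above can be written down as a small decision rule. This is only a rough sketch of one reading of the paragraph, not language from the Snowe amendment or any ISP's actual policy; the `Action` type, its field values, and the function name are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical description of something an ISP blocks."""
    target: str                  # "incoming_mail" or "web_request"
    reason: str                  # "network_abuse" or "content"
    user_opted_in: bool = False  # e.g. the user signed up for parental controls


def consistent_with_user_preferences(action: Action) -> bool:
    """Return True if the block plausibly follows what the user wants."""
    # A filter the user explicitly signed up for follows their wishes.
    if action.user_opted_in:
        return True
    # Blocking a site the user asked to visit overrides the user's choice,
    # even if the site belongs to an actual spammer.
    if action.target == "web_request":
        return False
    # Filtering incoming traffic is expected when the reason is spam or
    # other network abuse, but not when the reason is the sender's content.
    return action.reason == "network_abuse"


if __name__ == "__main__":
    print(consistent_with_user_preferences(Action("incoming_mail", "network_abuse")))
    print(consistent_with_user_preferences(Action("web_request", "network_abuse")))
```

Under this reading, what AboveNet did falls in the second case: an unrequested block on Web requests, which the rule rejects regardless of the stated reason.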
The MAPS incident thus shaped most of my opinions about Net Neutrality 6 years before the debate even had a name. When I first found out in August 2000 that our ISP was blacklisted, like most people I believed that the RBL really was a list of spammers; after all the MAPS web page said that the RBL was a list of networks that "originate or relay spam". So I called my ISP screaming at them for being incompetent spam-enablers (the culmination of many frustrating issues with them), and saying that if they really were letting customers send spam, or running an insecure server that spammers were hijacking, I would leave on principle, if the cretins managing our server didn't drop it in the lake first. The ISP owner then told me what happened: that the ISP was not blacklisted for spamming customers, but because of the content of the other sites. (Buried in the list of RBL criteria on MAPS's site was the statement that sites could be blacklisted for providing "spam software", although the criteria did not define how they distinguished between spam software and regular mailing list software, which is how our ISP got caught in the net. And the criteria did not disclose anywhere the most controversial feature of the RBL, which is that if an ISP didn't comply, MAPS would start blacklisting other unrelated sites at the same ISP to put more pressure on them.) I agreed that this seemed to be absurd, and said I wouldn't leave the ISP if they were being blackballed just because of the content of hosted pages.
I don't know exactly what the mail software in question did or where MAPS thought the line should be drawn, but I am a purist about content -- it's a long-standing principle among the Internet security community that if a tool exists which exploits a security hole, you don't try to make the software disappear, you fix the hole. And besides, since MAPS and their supporters wanted to blackball ISPs that hosted spamming software (however you defined that), but the same people had never advocated blackballing ISPs that hosted network break-in tools and other cracking programs, for example, then what were they really saying? That spamming someone is more unethical than breaking into their network?
But by far the most common objection to my complaint about AboveNet blocking Web sites was, "Hey, if a private company blocks things, as long as they're being honest to their users about it, who cares?" Well, true, but the fact that AboveNet blocked Web sites was not widely known even within the company; when I once called AboveNet feigning ignorance and asking them if they blocked RBL'ed Web sites, the technician who spoke to me said, "No, that wouldn't make any sense." (Well, half right.) Their AUP mentioned "protecting users from spam" but said nothing about blocking Web sites. In fact, other than "family-filtered" ISPs and similar services, I've never heard of any company blocking Web sites that actually did try to make their users aware of it. (On the other hand, even if AboveNet had fully disclosed their filtering, they were still a backbone company selling connectivity mainly to ISPs -- and I think if you sell something wholesale that can only be re-sold to the public by fraudulent means, then you're at least partly complicit in that fraud as well.)
If you're tempted to argue that backbone providers should be allowed to block whatever they want as long as they bury it in their AUP (although AboveNet and TeleGlobe didn't even do that much), just consider: When you access Google from your home computer, have you read the AUP of every network that the packets pass through, to check whether they reserve the right to block or even modify your traffic? Without doing a traceroute, could you even name all the networks that the traffic passes through? Do you really want the burden to be on you to check with all of them every time there's a problem reaching a Web site? Or do you feel like there's an understanding that as long as you pay your bill, they should let you go wherever you want?
Some have argued that if an ISP blocks the user from reaching a Web site, then even if the ISP is defrauding the user, that's still strictly an issue between the user and the ISP. But if a user is trying to reach your Web site, the user is trying to give you something of value: their attention, their eyeballs on your advertisements, sometimes even their money (with the expectation that you will provide them with something in return, of course, like some content worth reading). If the ISP steps in and blocks that, then the ISP has taken something of value that the user was attempting to give to you, and diverted it to serve their own interests. To me that doesn't seem ethically much different from the FedEx driver swiping the chocolates that someone tried to send you for Valentine's Day. Is that just between the sender and FedEx? Or do you have a beef because you didn't get the present that was intended for you, and you had to eat last week's chocolates to cheer up?
The modern-day threats to Net Neutrality are different: slowing access to Web sites unless the site owners pay a "toll", instead of blocking access to sites because of the content of other sites hosted at the same ISP. But they both boil down to the same thing: not giving end users what they have already paid for. If a user buys Internet access, they almost always buy it with the understanding that if they access a site, the content will download as quickly as their connection allows.
Thus the most common misconception about Net Neutrality is that the proponents are fighting against "capitalism" -- ISPs just charging more for different delivery speeds. But ISPs are already charging users for those delivery lines -- including different tiers for different prices. That's capitalism, and it works, with prices falling all the time in a fairly competitive market. But charging publishers for those higher delivery speeds to the user's house, is really more like double-billing, because the user has already been charged once for the lines that the content is coming over, so the ISP is trying to charge the content publisher again for the same service. Of course, if you charge party A for doing X, and then you try to charge party B for the same instance of doing X, and party B doesn't pay up so you don't do X, you're also breaking your deal with A. Brad Templeton of the EFF stated as much on his blog in 2006:
The pipes start off belonging to the ISPs but they sell them to their customers. The customers are buying their line to the middle, where they meet the line from the other user or site they want to talk to. The problem is generated because the carriers all price the lines at lower than they might have to charge if they were all fully saturated, since most users only make limited, partial use of the lines. When new apps increase the amount a typical user needs, it alters the economics of the ISP. They could deal with that by raising prices and really delivering the service they only pretend to sell, or by charging the other end, and breaking the cost contract. They've rattled sabres about doing the latter.
And I think the same is clearly true if, instead of trying to extract money from the content publisher, the ISP tries to extract something else, like an agreement to shut down certain Web sites before the ISP will let their users view other sites hosted at the same company. You can talk all day about how evil those Web sites are, but the ISP has already sold the user a connection with the implied ability to access them.
Anyway, this all came out in 2000 when a Slashdot article revealed that AboveNet had been blocking Web sites, and AboveNet stopped doing it two hours after the article came out. (TeleGlobe stuck with it for a few more months.) But from the hostility of the reaction, you'd think that we had published cartoons in a Danish newspaper showing Paul Vixie with a bomb in his turban. I got more e-mails than I could count arguing that AboveNet had the right to block whatever Web sites they felt like, regardless of whether the end users knew it was happening. To those people, I'd be sincerely interested in their answer to this question: Does that mean they'd have no problem if they found out their ISP was silently blocking sites for political reasons? There is a clear line between following user preferences by blocking spam, and countermanding user preferences by blocking sites because of their content -- and once you've crossed that line, where's the logical stopping point? Seriously, I would have liked to have known how they would answer that, if I could have gotten any meaningful dialog going with them, which most of the time I couldn't. At the time, I'd just spent four years telling people that kids looking at porn was a non-issue, and that by the way if their kids came to my Web site I'd even help them get around their blocking software, and I still got more angry e-mails for disclosing the fact that AboveNet blocked Web sites based on their content, than I'd gotten in all the previous four years combined.
(A few even accused us of moving into a blacklisted address block on purpose. This was because the actual move happened after the blacklisting was in place, even though I told them all that our ISP had announced the coming move two months before -- repeat, before -- they ever heard from MAPS. Some people were so in love with that "smoking gun" that they didn't believe me; that's their prerogative. But don't take my word for it -- when one supporter wrote to MAPS to ask about un-blocking our site, MAPS officer Kelly Thompson replied:
>Would it be possible to
>selectively unblock peacefire.org (209.211.253.169)?
Technically? Yes, it is. It's a violation of our policy, though, so I can't do so.
I would be willing to help you find other free or reduced cost hosting, however.
It was MAPS's decision, not ours or our ISP's, to have our site blocked. That should settle that once and for all, just as soon as there is peace in the Middle East and a black lesbian in the White House.)
But what do all these people think about Net Neutrality, 6 years later? I tried to track down the influential people who had spoken out supporting AboveNet's blocking of Web sites, or at least their right to block Web sites. My position was, we can agree to disagree on that, but if they really feel that way, why haven't they been speaking out against Net Neutrality? The proposed Snowe amendment was pretty clear:
SEC. 12. INTERNET NEUTRALITY
(a) Duty of Broadband Service Providers- With respect to any broadband service offered to the public, each broadband service provider shall--
(1) not block, interfere with, discriminate against, impair, or degrade the ability of any person to use a broadband service to access, use, send, post, receive, or offer any lawful content, application, or service made available via the Internet.
John Levine, webmaster of Abuse.Net, head of the IRTF's Anti-Spam Research Group, and one of the most vocal critics of Peacefire's campaign against AboveNet's Web filtering, said that he would have opposed the bill but didn't bother because it didn't have much chance of passing. Well, it didn't, but the bill was significant not because of its likelihood of passage, but because it articulated the principles that the Net Neutrality coalition had rallied around, and with the momentum behind the movement, it's likely to achieve at least some of its goals, by legislation or otherwise.
Paul Vixie, Dave Rand, and Steve Linford did not respond to requests for comment on Net Neutrality. But Paul Vixie wrote something very interesting in a May 2006 blog post:
Second, there's network neutrality. In telephone service, the government mandates that all companies providing voice-grade telephony interconnect with eachother at preset rates, thus ensuring that any phone can call any other phone and that new phone companies can enter the field to help ensure competition. In Internet service, the government mandates nothing. Recently SBC (I mean AT&T, I think, is it Wednesday?) rattled its sabre and said that Google and other content supplying companies should be paying for the use of SBC's backbone to reach SBC's eyeballs. Most of us said, uh, what? "Aren't SBC's own customers paying SBC to carry that traffic?" Some of us even said "I am not an eyeball, I am a person!" But anyway, from time to time these Internet companies shut down interconnects in hopes of creating new cash flows among eachother, and until the government regulates this, we're all at risk of higher prices or lower service with zero notice. Some well meaning democrats are trying to challenge this with "network neutrality" legislation, but this probably isn't their year. Or their decade.
San Francisco has a government, though. And if San Francisco owned and operated its own wireless Internet plant, we could mandate that any Internet company wishing to do business in this city interconnect at fair and reasonable cost to all other Internet companies wishing to do business in this city.
"Until the government regulates this"? "Government mandates"? "Fair and reasonable cost"? Quick, call the anti-socialist intervention squad! How long does it take those San Francisco hippies to suck the new arrivals' brains out anyway? Of course, I agree with everything he said. It's just that if you replace "create new cash flows" with "try to get ISPs to remove content from their servers", this describes exactly what Vixie and AboveNet were doing a few years earlier. He's a smart guy, and I'm sure this didn't escape his sense of irony, so perhaps this confirms something I'd suspected all along, which is that Vixie understood the subtleties of the issue better than most of his cheerleaders, and may be having second thoughts about AboveNet's Web-blocking misadventure. From the beginning, in a 1997 interview with Sun World, he sounded like someone trying to at least keep an open mind:
Concentration of power into a single individual: It's very true that power has corrupted every individual in whom it has ever been concentrated in the history of mankind. I do not feel that I am necessarily above whatever elements of human nature give rise to that. I worry about it. Probably other people worry about it more than I do.
He made no such frank statements, though, during the controversy over AboveNet's Web site blocking. (Perhaps MAPS's lawyers were worried that he was a little too unfiltered and advised him not to comment; at the time, the MAPS Web site had a "How to sue MAPS" link on the front page.)
Speaking of which, Anne Mitchell, Director of Legal and Public Affairs for MAPS during the time when AboveNet was blocking Web sites, was the only MAPS adherent from the era that I could find who has since clearly and publicly come out against Net Neutrality. In May 2006 she wrote:
Here's the thing that the 3Ns (Net Neutrality Nuts) don't get: bandwidth costs money. And if you can't charge those who use the majority of it accordingly, then you are going to have to amortize it across everybody.
So, if a net neutrality law passes, don't be surprised when your costs to have an Internet account skyrocket.
Because somebody has to pay those bills, and if the law says that the ISPs can't charge the big guys - the big users - differently, it means that they have to charge them the same rate that they charge everyone else. And that means not that their rate will go down, but that everybody else's rate will go up.
And then again in February 2007 in another blog post titled "Towards A Nanny Internet", she wrote, "Network neutrality is the idea that ISPs should be forced to charge everybody the same for their Internet use", grouping it together with proposed anti-bullying and anti-anonymity laws.
Well, points to Anne for being consistent, and for publicly declaring her views in no uncertain terms, which is all I'm asking of the other supporters of AboveNet's website blocking policy. (Although she's coming at it from a different angle this time, "How do we work out who pays for the traffic" rather than "ISPs should be allowed to block whatever they want without telling anybody".) But this is also a textbook example of what I think are the three major fallacies of opposition to Net Neutrality:
First, lumping it together with other examples of unpopular regulation and calling it one more example of Big Government -- an argument also tried in other editorials ("Politicians and public figures alike should realize the absurdity of advocating more red tape to keep the Internet free"). This meme has never really caught on, possibly because groups like the ACLU and the EFF that have traditionally opposed true Internet censorship, have lined up in favor of Net Neutrality. All the proposed "red tape" and "regulation" really says is that if a user attempts to access a Web site over a connection that they've paid for, the ISP may not block or slow down their access, a law which most people would hardly consider tyrannical.
Second, asserting that "Network neutrality is the idea that ISPs should be forced to charge everybody the same for their Internet use." I've never actually heard anyone advocate anything close to that, but a common question among skeptics is why different "tiers" for Internet traffic are really any different from different-tiered pricing for dial-up vs. DSL, or for different levels of Web hosting. The difference is that when users and Web site owners pay for those connections, they are paying for their respective connections to the rest of the Internet. But an ISP charging a Web site owner to carry their traffic the last mile to the user's house, is not charging for a product or service, but really charging a fee not to break a service that they've already agreed to provide to the user.
Which leads to the third misconception: "Here's the thing that the 3Ns (Net Neutrality Nuts) don't get: bandwidth costs money... So, if a net neutrality law passes, don't be surprised when your costs to have an Internet account skyrocket." But it's not about how much a service costs, but about the ethics of double-billing for it. We know that ISP pricing models can already support the total traffic that people consume today, and ISPs do already follow net neutrality principles most of the time, so nobody's costs will "skyrocket" just because a neutrality law passes. If vastly more people start trying to stream CNN over the Internet 24/7, and fully using the services that ISPs have "only been pretending to sell" as Brad Templeton put it, then ISPs may have to charge more for users who consume too much bandwidth, encouraging people to stay at today's average levels by rationing themselves and perhaps watching 24 on their $5,000 TV sets sometimes instead of downloading it off of BitTorrent to their laptop every week because it makes them feel like a haX0r. Much as we all love our unmetered connections, it wouldn't be a violation of Net Neutrality for ISPs to charge users for bandwidth hogging, to keep everyone from going too far above today's levels. What ISPs should not do is charge users for implied full-throttle connections, and then turn around to charge publishers for moving bits over those same lines, or block the connection for any other reason.
So, yes, Virginia, blocking of Web sites does happen -- and by "Virginia", I mean FTC Chairman Deborah Platt Majoras, who said in a speech in August 2006: "I have to say, thus far, proponents of net neutrality regulation have not come to us to explain where the market is failing or what anticompetitive conduct we should challenge; we are open to hearing from them." This was echoed in an editorial later that month from Sonia Arrison of the Pacific Research Institute:
Internet service providers have voluntarily upheld content-neutral practices without the need for government intervention, and consumers would never stand for blocked Web sites... If the loss of net neutrality principles was really a problem, advocates wouldn't need to scare Americans in order to win their support. Using government regulation preemptively to shortchange business partners is a reckless abuse of the public policy process. New laws should be based on facts and reality, not fear and hypothetical situations.
I guess both of those ladies' ISPs must be blocking access to the SaveTheInternet.com Web site, so I e-mailed both of them the coalition's list of examples, and added a note about the AboveNet/TeleGlobe incident as well. No personal response from either of them yet, but I'm sure they just got lost in the shuffle while they were so busy sending out corrections. (On the other hand, I did get a courteous response from Randolph J. May of the Free State Foundation, when I wrote to him about an editorial he penned which also argued that violations have not happened: "It is generally agreed that except for a few isolated and quickly remedied incidents, neither the cable operators nor the telephone companies providing broadband Internet services have blocked, impaired or otherwise restricted subscriber access to the content of unaffiliated entities." He said he hadn't known about the AboveNet/TeleGlobe incident either.)
Another theme in some anti-Net-Neutrality editorials is that existing laws are enough to deal with the problem. In Majoras's speech, she said, "We should not forget that we already have in place an existing law enforcement and regulatory structure." Arrison echoed that "Numerous federal agencies already have set a basic legal framework in place to preserve fair competition and business practices on the Internet". Well, as Yogi Berra says, in theory, there is no difference between theory and practice, but in practice, there is. After I found out AboveNet and TeleGlobe were blocking my Web site, I called about twenty lawyers in the Bellevue phone book, figuring: I wasn't greedy, but surely there would be financial damages for deceiving users and blocking our site, enough to pay a lawyer in return for handling the case?
I think about two lawyers called me back, and they both said that even though what the backbone companies were doing clearly looked like fraud, it would take tens of thousands of dollars just to get started, and even if we ever got to court, the judge could call it however they wanted. Whatever laws exist now, they may help the slightly smaller big guy against the bigger big guy, but are not much use to the little or medium-sized guy.
So, any informed debate about Net Neutrality has to include the fact that, yes, some providers have blocked Web sites on purpose, for long periods of time, and no, the free market didn't fix it by itself. Even if something on that scale never happens again, if the free market and the anti-trust laws didn't automatically correct a case where Web sites were being blocked outright, then it's wishful thinking to think that those forces will prevent ISPs from merely slowing down Web access to sites that haven't paid a "toll", as they have made noises about doing. One AboveNet customer, Sam Knutson, said when he found out about the Web site blocking, "This type of behavior on the part of an ISP is reprehensible. I pay for a pipe and don't expect this type of monkey business." Well, I agree that it's reprehensible; whether we should "expect" more of it or not, depends on how much the Net Neutrality movement achieves its goals.
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
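The tabulation described above can be sketched roughly as follows. This is only an illustration of the bookkeeping, not the actual test harness: `find_blocked`, `error_rate`, the `lookup` callable, and every URL in `TOY_DB` are hypothetical stand-ins, and deciding which blocks count as "obvious errors" is still a human judgment made outside the code.

```python
from typing import Callable, Dict, Iterable, Optional, Set


def find_blocked(urls: Iterable[str],
                 lookup: Callable[[str], Optional[str]]) -> Dict[str, str]:
    """Run every URL through the filter; keep those it assigns a category to."""
    blocked = {}
    for url in urls:
        category = lookup(url)  # None means "not blocked"
        if category is not None:
            blocked[url] = category
    return blocked


def error_rate(blocked: Dict[str, str], judged_errors: Set[str]) -> float:
    """Share of blocked URLs that a human reviewer marked as obvious errors."""
    if not blocked:
        return 0.0
    wrong = sum(1 for url in blocked if url in judged_errors)
    return wrong / len(blocked)


# Toy stand-in for the filter's database (made-up entries, not real Bess data).
TOY_DB = {
    "http://adult-site.example/": "Pornography",
    "http://garden-shop.example/": "Pornography",
    "http://rail-group.example/": "Pornography",
    "http://another-adult-site.example/": "Pornography",
}

candidates = list(TOY_DB) + ["http://news.example/"]
blocked = find_blocked(candidates, TOY_DB.get)

# URLs a human reviewer looked at and judged to be wrongly categorized.
errors = {"http://garden-shop.example/", "http://rail-group.example/"}
print(f"{len(blocked)} blocked, error rate {error_rate(blocked, errors):.0%}")
```

Note that the rate is measured against the set of blocked sites, not against all URLs tested, which is how the 30% figure above is framed.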
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I verified against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out, was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. The would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I verified against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
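For readers who want to check the shared-IP explanation themselves, here is a minimal sketch of one way to do it: resolve a list of hostnames and group them by IP address, so that any blocked site sharing an address with other hosts stands out. The function name group_hosts_by_ip and the injectable resolve parameter are my own illustrative choices, not anything N2H2 or Secure Computing published; by default it uses Python's standard socket.gethostbyname, which only returns a single IPv4 address per host, so this is a rough check and not a full picture of virtual hosting.

```python
import socket
from collections import defaultdict


def group_hosts_by_ip(hostnames, resolve=socket.gethostbyname):
    """Group hostnames by the IP address they resolve to.

    `resolve` is injectable (an assumption made for testability); by default
    it is socket.gethostbyname, which returns one IPv4 address as a string.
    Hosts that fail to resolve are collected under the key None.
    """
    groups = defaultdict(list)
    for host in hostnames:
        try:
            ip = resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            ip = None
        groups[ip].append(host)
    return dict(groups)


def shared_ips(groups):
    """Return only the IPs that more than one hostname resolved to."""
    return {ip: hosts for ip, hosts in groups.items()
            if ip is not None and len(hosts) > 1}
```

If a blocked site sits alone at its IP in the output of shared_ips, the "we blocked everything on that machine" explanation does not obviously apply to it, at least for the hostnames you thought to test.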
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing, which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography" suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change its name in part because of students reportedly being blocked from accessing its website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because they give the companies a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, have exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
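To make the "30% of blocked sites" figure concrete, here is a minimal sketch of the tabulation, under some loud assumptions: is_blocked stands in for whatever lookup you have against the filter (a lookup form or a local copy of the software), and is_obvious_error stands in for a human reviewer's judgment about each blocked page. Both names are hypothetical placeholders, not part of any real Bess or SmartFilter interface. The important detail is the denominator: the rate is computed over blocked sites only, not over every URL tested.

```python
def blocked_error_rate(urls, is_blocked, is_obvious_error):
    """Fraction of *blocked* URLs that a reviewer judged to be obvious errors.

    is_blocked(url) -> bool        hypothetical stand-in for the filter lookup
    is_obvious_error(url) -> bool  hypothetical stand-in for human review
    Returns 0.0 when nothing in the sample is blocked.
    """
    blocked = [url for url in urls if is_blocked(url)]
    if not blocked:
        return 0.0
    errors = [url for url in blocked if is_obvious_error(url)]
    return len(errors) / len(blocked)
```

For example, if 10 URLs out of a sample of 1,000 come back blocked and a reviewer flags 3 of those 10 as clearly non-pornographic, the function returns 0.3, regardless of how many unblocked URLs were in the sample.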
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I confirmed against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing, which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography" suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change its name in part because students were reportedly being blocked from accessing its website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because they give the companies a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try to come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes " From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I tested against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out, was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. Smartfilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. The would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing, which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography" suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change its name in part because students were reportedly being blocked from accessing its website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because they give them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try to come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, have exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
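The testing procedure described above (run a list of URLs through the filter, then count how many of the blocked ones are obvious mistakes) can be sketched roughly as follows. This is only an illustration, not the actual tooling used for the tests: the `is_blocked` and `is_obvious_error` callables are hypothetical stand-ins for a filter lookup and a human reviewer's judgment, and the sample URLs and counts are invented placeholders, not results from the tests.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Verdict:
    url: str
    blocked: bool        # what the filter said
    obvious_error: bool  # what a human reviewer said about a blocked URL


def run_list(urls: Iterable[str],
             is_blocked: Callable[[str], bool],
             is_obvious_error: Callable[[str], bool]) -> List[Verdict]:
    """Run every URL through the filter; review only the ones it blocks."""
    verdicts = []
    for url in urls:
        blocked = is_blocked(url)
        error = blocked and is_obvious_error(url)
        verdicts.append(Verdict(url, blocked, error))
    return verdicts


def error_rate(verdicts: Iterable[Verdict]) -> float:
    """Share of blocked URLs that were judged obvious errors."""
    blocked = [v for v in verdicts if v.blocked]
    if not blocked:
        return 0.0
    return sum(v.obvious_error for v in blocked) / len(blocked)


# Invented placeholder data, not results from the article.
SAMPLE_URLS = [f"http://site{i}.example/" for i in range(20)]
BLOCKED = set(SAMPLE_URLS[:10])   # pretend the filter blocks 10 of the 20
MISTAKES = set(SAMPLE_URLS[:3])   # pretend a reviewer flags 3 of those 10

verdicts = run_list(SAMPLE_URLS, BLOCKED.__contains__, MISTAKES.__contains__)
print(f"{error_rate(verdicts):.0%} of blocked sites were obvious errors")
```

Note that the rate is computed over blocked sites only, which matches the "30% of blocked sites" phrasing; it says nothing about how many sites that should have been blocked slipped through.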
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I confirmed against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing, which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography" suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change its name in part because students were reportedly being blocked from accessing its website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because they give them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try to come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes " From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I tested against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out, was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. Smartfilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. The would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentor.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes " From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I tested against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out, was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. Smartfilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. The would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try to come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
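As a rough illustration of the tabulation described above (a sketch only, not the script actually used for this report; the `Result` fields, the function name, and the sample URLs are all made up for the example), the share of blocked sites that are obvious errors could be computed in Python like this:

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    blocked: bool        # did the filter block this URL?
    obvious_error: bool  # did a human reviewer judge the block an obvious mistake?


def error_rate_among_blocked(results):
    """Return the fraction of blocked URLs that a reviewer flagged as obvious errors."""
    blocked = [r for r in results if r.blocked]
    if not blocked:
        return 0.0
    errors = sum(1 for r in blocked if r.obvious_error)
    return errors / len(blocked)


# Hypothetical sample: 10 blocked URLs (3 of them obvious errors) plus 5 unblocked ones.
sample = (
    [Result(f"http://error-{i}.example/", True, True) for i in range(3)]
    + [Result(f"http://correct-{i}.example/", True, False) for i in range(7)]
    + [Result(f"http://allowed-{i}.example/", False, False) for i in range(5)]
)

rate = error_rate_among_blocked(sample)
print(f"{rate:.0%} of blocked sites were obvious errors")
```

Note that the denominator is the number of blocked sites, not the number of URLs tested, which is how the "about 30%" figure above is framed.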
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I verified against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
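To make the shared-IP scenario concrete, here is a minimal Python sketch of how blocking by IP would sweep in unreviewed sites. It is not N2H2's actual logic, which is not public; the function name, the hostnames, and the addresses (taken from the reserved documentation range) are all hypothetical.

```python
def collateral_blocks(host_to_ip, reviewed_blocked):
    """Return hosts that end up blocked only because they share an IP
    with a host that was reviewed and blocked."""
    # IPs of every host that was deliberately blocked.
    blocked_ips = {host_to_ip[h] for h in reviewed_blocked if h in host_to_ip}
    # Any other host on one of those IPs is blocked without review.
    return sorted(
        host
        for host, ip in host_to_ip.items()
        if ip in blocked_ips and host not in reviewed_blocked
    )


# Hypothetical shared-hosting layout: three sites on one address, one on another.
host_to_ip = {
    "adult-site.example": "192.0.2.10",
    "garden-club.example": "192.0.2.10",
    "art-gallery.example": "192.0.2.10",
    "rail-group.example": "192.0.2.20",
}

print(collateral_blocks(host_to_ip, {"adult-site.example"}))
```

In this made-up layout, the garden club and the art gallery are blocked without anyone ever looking at them, while the rail group on its own address is untouched. A site that has an IP address to itself cannot be explained this way.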
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try to come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes " From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I tested against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out, was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. Smartfilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. The would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentor.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes " From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I tested against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out, was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. Smartfilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. The would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
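For readers who want to see what that tally amounts to, here is a minimal sketch, assuming each URL has already been run through the filter and each blocked URL has been hand-judged as an obvious error or not. The `LookupResult` record, the `error_rate_among_blocked` function, and the sample URLs are all hypothetical placeholders for illustration; they are not the actual test data or tooling behind the figures above.

```python
from dataclasses import dataclass


@dataclass
class LookupResult:
    url: str
    blocked: bool    # did the filter block this URL?
    is_error: bool   # hand-judged: blocked, but obviously not pornography


def error_rate_among_blocked(results):
    """Return the fraction of blocked URLs that were judged to be errors."""
    blocked = [r for r in results if r.blocked]
    if not blocked:
        return 0.0  # nothing was blocked, so there is no error rate to report
    return sum(r.is_error for r in blocked) / len(blocked)


# Hypothetical sample: five URLs tested, four blocked, one of those an error.
sample = [
    LookupResult("http://adult-one.example/", blocked=True, is_error=False),
    LookupResult("http://adult-two.example/", blocked=True, is_error=False),
    LookupResult("http://adult-three.example/", blocked=True, is_error=False),
    LookupResult("http://charity.example/", blocked=True, is_error=True),
    LookupResult("http://news.example/", blocked=False, is_error=False),
]

rate = error_rate_among_blocked(sample)
```

Note that the denominator is the number of blocked sites, not the number of sites tested, so the figure describes how often a block is wrong rather than how often the filter touches an innocent page.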
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I confirmed against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
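To make the IP-sharing point concrete, here is a rough sketch of how blocking by IP sweeps in unreviewed sites. This is only an illustration of the behavior the N2H2 employee described, not N2H2's actual code; the function name, the hostnames, and the addresses (taken from the reserved documentation range) are all made up.

```python
from collections import defaultdict


def propagate_blocks_by_ip(host_to_ip, reviewed_blocked):
    """Return every host that ends up blocked when a block on one reviewed
    host is extended to all hosts sharing that host's IP address."""
    hosts_by_ip = defaultdict(set)
    for host, ip in host_to_ip.items():
        hosts_by_ip[ip].add(host)

    blocked = set()
    for host in reviewed_blocked:
        ip = host_to_ip.get(host)
        if ip is None:
            blocked.add(host)           # IP unknown: only the host itself is blocked
        else:
            blocked |= hosts_by_ip[ip]  # everything at that IP goes with it
    return blocked


# Hypothetical shared-hosting setup: two sites on one IP, one site on its own.
host_to_ip = {
    "adult.example": "192.0.2.10",
    "gallery.example": "192.0.2.10",
    "charity.example": "192.0.2.20",
}
reviewed_blocked = {"adult.example"}

blocked = propagate_blocks_by_ip(host_to_ip, reviewed_blocked)
never_reviewed = blocked - reviewed_blocked
```

In this toy setup the gallery site is blocked without anyone having looked at it, while the charity site on its own IP is untouched. A site that sits alone on its IP and is still blocked, like the two mentioned above, needs some other explanation.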
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, I recently began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, have exploded. The result? I'm still tabulating data, but it looks as if the error rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
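The tally described above can be sketched in a few lines of Python. This is only an illustration, not the script used for these tests: the `Result` record, the `error_rate_among_blocked` function, and the placeholder `.example` URLs are all made up for the example, and the "obvious error" flag is assumed to come from a person looking at each blocked site by hand.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One URL from the test list (hypothetical record layout)."""
    url: str
    blocked: bool        # did the blocking software block this URL?
    obvious_error: bool  # hand-judged: is the block clearly wrong?


def error_rate_among_blocked(results):
    """Return the fraction of blocked URLs judged to be obvious errors."""
    blocked = [r for r in results if r.blocked]
    if not blocked:
        return 0.0  # nothing was blocked, so there is nothing to misclassify
    errors = sum(1 for r in blocked if r.obvious_error)
    return errors / len(blocked)


# Placeholder data, not real test results.
sample = [
    Result("http://site-a.example/", True, True),
    Result("http://site-b.example/", True, False),
    Result("http://site-c.example/", False, False),
    Result("http://site-d.example/", True, False),
    Result("http://site-e.example/", True, False),
]

rate = error_rate_among_blocked(sample)
print(f"{rate:.0%} of blocked sites were obvious errors")
```

On this made-up sample, one of the four blocked URLs is flagged as an error, so the function returns 0.25. Note that the denominator is the number of blocked sites, not the number of sites tested, which matches how the 30% figure is stated.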
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I confirmed against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in its 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, blocking by IP wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
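The shared-IP explanation is easy to check mechanically once each hostname has been resolved to an address. The sketch below is hypothetical throughout: the `hosts_sharing_ip` function, the `.example` hostnames, and the 192.0.2.x addresses are placeholders, and it assumes the hostname-to-IP table has already been collected by some other means.

```python
from collections import defaultdict


def hosts_sharing_ip(host_to_ip):
    """Group hostnames by IP and keep only addresses used by more than one host."""
    groups = defaultdict(list)
    for host, ip in host_to_ip.items():
        groups[ip].append(host)
    return {ip: sorted(hosts) for ip, hosts in groups.items() if len(hosts) > 1}


# Placeholder lookup table, not real DNS data.
resolved = {
    "gallery.example": "192.0.2.10",
    "shelter.example": "192.0.2.20",
    "adult-a.example": "192.0.2.30",
    "club.example": "192.0.2.30",
}

shared = hosts_sharing_ip(resolved)
for ip, hosts in shared.items():
    print(ip, "is shared by", ", ".join(hosts))
```

In this made-up table only `club.example` shares an address with another site, so it is the only one whose block could be blamed on a neighbor. A site that sits alone on its IP, like the two placeholders at the top, needs some other explanation.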
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess. In terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- I'd say that any one of those examples, or any one of the sites listed here, is more significant than, say, the blocking of BoingBoing, which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography" suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change its name in part because of students reportedly being blocked from accessing its website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because they give the companies a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try to come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes " From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I tested against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
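To make the IP-level blocking described by that N2H2 employee concrete, here is a rough Python sketch of how blocking by IP sweeps in sites nobody looked at. This is only an illustration of the idea, not N2H2's actual code, which I have never seen. The function name `collateral_blocks`, the hostnames, and the addresses (taken from the reserved documentation range) are all made up.

```python
from typing import Dict, List, Set


def collateral_blocks(host_to_ip: Dict[str, str], reviewed_blocked: Set[str]) -> List[str]:
    """List hosts blocked only because they share an IP with a reviewed, blocked host."""
    # Every IP that serves at least one deliberately blocked host.
    blocked_ips = {host_to_ip[h] for h in reviewed_blocked if h in host_to_ip}
    # Any other host on one of those IPs is blocked without review.
    return sorted(
        host
        for host, ip in host_to_ip.items()
        if ip in blocked_ips and host not in reviewed_blocked
    )


if __name__ == "__main__":
    # Hypothetical shared-hosting layout.
    hosting = {
        "adult-site.example": "192.0.2.10",
        "garden-club.example": "192.0.2.10",   # same machine as the adult site
        "rail-advocacy.example": "192.0.2.10",
        "art-gallery.example": "192.0.2.20",   # has an IP to itself
    }
    print(collateral_blocks(hosting, {"adult-site.example"}))
```

In this toy layout the art gallery has an address to itself, so IP-level blocking can never catch it, which is the same reasoning as for the sites above that can be reached directly by IP address.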
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing, which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography" suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because they give them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
From Bess to Worse
Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition", which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, have exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I confirmed against a copy of the software that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography", suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because it gives them a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?
-
Could Open Source Lead to a Meritocratic Search Engine?
Slashdot contributor Bennett Haselton writes "When Jimmy Wales recently announced the Search Wikia project, an attempt to build an open-source search engine around the user-driven model that gave birth to Wikipedia, he said his goal was to create "the search engine that changes everything", as he underscored in a February 5 talk at New York University. I think it could, although not for the same main reasons that Wales has put forth -- I think that for a search engine to be truly meritocratic would be more of a revolution than for a search engine to be open-source, although both would be large steps forward. Indeed, if a search engine could be built that really returned results in order of average desirability to users, and resisted efforts by companies to "game" the system (even if everyone knew precisely how the ranking algorithm worked), it's hard to overstate how much that would change things both for businesses and consumers. The key question is whether such an algorithm could be created that wouldn't be vulnerable to non-merit-based manipulation. Regardless of what algorithms may be currently under consideration by thinkers within the Wikia company, I want to argue logically for some necessary properties that such an algorithm should have in order to be effective. Because if their search engine becomes popular, they will face such huge efforts from companies trying to manipulate the search results, that it will make Wikipedia vandalism look like a cakewalk." The rest of his essay follows.
This will be a trip into theory-land, so it may be frustrating to users who dislike talk about "vaporware" and want to see how something works in practice. I understand where you're coming from, but I submit it's valuable to raise these questions early. This is in any case not intended to supplant discussion about how things are currently progressing.
First, though, consider the benefits that such a search engine could bring, both to content consumers and content providers, if it really did return results sorted according to average community preferences. Suppose you wanted to find out if you had a knack for publishing recipes online and getting some AdSense revenue on the side. You take a recipe that you know, like apple pie, and check out the current results for "apple pie". There are some pretty straightforward recipes online, but you believe you can create a more complete and user-friendly one. So you write up your own recipe, complete with photographs of the process showing how ingredients should be chopped and what the crust mixture should look like, so that the steps are easier to follow. (Don't you hate it when a recipe says "cut into cubes" and you want to throttle the author and shout, "HOW BIG??" It drove me crazy until I found CookingForEngineers.com.) Anyway, you submit your recipe to the search engine to be included in the results for "apple pie", and if the sorting process is truly meritocratic, your recipe page rises to the top. Until, that is, someone decides to surpass you, and publishes an even more user-friendly recipe, perhaps with a link to a YouTube video of them showing how to make the pie, which they shot with a tripod video camera and a clip-on mike in their well-lit kitchen. In a world of perfect competition, content providers would be constantly leapfrogging each other with better and better content within each category (even a highly specific one like apple pie recipes), until further efforts would no longer pay for themselves with increased traffic revenue. (The more popular search terms, of course, would bring greater rewards for those listed at the top, and would be able to pay for greater efforts to improve the content within that category.) But this constant leapfrogging of better and better content requires efficient and speedy sorting of search results in order to work. 
It doesn't work if the search results can be gamed by someone willing to spend effort and money (not worth it for the author of a single apple pie recipe, but worth it for a big money-making recipe site), and it doesn't work if it's impossible for new entrants to get hits when the established players already dominate search results.
Efficient competition benefits consumers even more for results that are sorted by price (assuming that among comparable goods and services, the community promotes the cheapest-selling ones to the top of the search results, as "most desirable"). If you were a company selling dedicated Web hosting, for example, you would submit your site to the engine to be included in results for "dedicated hosting". If you could demonstrate to the community that your prices and services were superior to your competitors', and if the ranking algorithm really did rank sites according to the preferences of the average user, your site could quickly rise to the top, and you'd make a bundle on new sales -- until, of course, someone else had the same idea and knocked you out of the top spot by lowering their prices or improving their services. The more efficient the marketplace, the faster prices fall and service levels rise, until prices just cover the cost of providing the service and compensating the business owner for their time. It would be a pure buyer's market.
It's important to precisely answer the question: Why would this system be better than a system like Google's search algorithm, which can be "gamed" by enterprising businesses and which doesn't always return the results first that the user would like the most? You might be tempted to answer that in an inefficient marketplace created by an inefficient search result sorting algorithm, a user sometimes ends up paying $79/month for hosting, instead of the $29/month that they might pay if the marketplace were perfectly efficient. But this by itself is not necessarily wasteful. The extra $50 that the user pays is the user's loss, but it's also the hosting company's gain. If we consider costs and benefits across all parties, the two cancel out. The world as a whole is not poorer because someone overpaid for hosting.
The real losses caused by an inefficient search algorithm are the efforts spent by companies to game the search results (e.g. paying search engine optimization firms to try and get them to the top Google spot), and the reluctance of new players to enter that market if they don't have the resources to play those games. If two companies each spend $5,000 trying to knock each other off of the top spot for a search like "weddings", that's $10,000 worth of effort in total that gets burned up with no offsetting amount of goods and services added to the world. This is what economists call a deadweight loss, with no corresponding benefit to any party. The two wedding planners might as well have smashed their pastel cars into each other. Even if a single company spends the effort and money to move from position #50 to position #1, that gain to them is offset by the loss to the other 49 companies that each moved down by one position, so the net benefit across all parties is zero, and the effort that the company spent to raise their position would still be a deadweight loss.
On the other hand, if search engine results were sorted according to a true meritocracy, then companies that wanted to raise their rankings would have to spend effort improving their services instead. This is not a deadweight loss, since these efforts result in benefits or savings to the consumer.
I've been a member of several online entrepreneur communities, and I'd conservatively estimate that members spend less than 10% of the time talking about actually improving products and services, and more than 90% of the time talking about how to "game" the various systems that people use to find them, such as search engines and the media. I don't blame them, of course; they're just doing what's best for their company, in the inefficient marketplace that we live in. But I feel almost lethargic thinking of that 90% of effort that gets spent on activities that produce no new goods and services. What if the information marketplace really were efficient, and business owners spent nearly 100% of their efforts improving goods and services, so that every ounce of effort added new value to the world?
Think of how differently we'd approach the problem of creating a new Web site and driving traffic to it. A good programmer with a good idea could literally become an overnight success. If you had more modest goals, you could shoot a video of yourself preparing a recipe or teaching a magic trick, and just throw it out there and watch it bubble its way up the meritocracy to see if it was any good. You wouldn't have to spend any time networking or trying to rig the results, you just create good stuff and put it out there. No, despite whatever cheer-leading you may have heard, it doesn't quite work that way yet -- good online businessmen still talk about the importance of networking, advertising, and all the other components of gaming the system that don't relate to actually improving products and services. But there is no reason, in principle, why a perfectly meritocratic content-sorting engine couldn't be built. Would it revolutionize content on the Internet? And, could Search Wikia be the project to do it, or play a part in it?
Whatever search engine the Wikia company produced, it would probably have such a large following among the built-in open-source and Wikipedia fan base, that traffic wouldn't be a problem -- companies at the top of popular search results would definitely benefit. The question is whether the system can be designed so that it cannot be gamed. I agree with Jimmy Wales's stated intention to make the algorithm completely open, since this makes it easier for helpful third parties to find weaknesses and get them fixed, but of course it also makes it easier for attackers to find those weaknesses and exploit them. If you think Microsoft paying a blogger to edit Wikipedia is a problem, imagine what companies will do to try and manipulate the search results for a term like "mortgage". So what can be done?
The basic problem with any community that makes important decisions by "consensus" is that it can be manipulated by someone who creates multiple phantom accounts all under their control. Then if a decision is influenced by voting -- for example, the relative position of a given site in a list of search results -- then the attacker can have the phantom accounts all vote for one preferred site. You can look for large numbers of accounts created from the same IP address, but the attacker could use Tor and similar systems to appear to be coming from different IPs. You could attempt to verify the unique identity of each account holder, by phone for example, but this requires a lot of effort and would alienate privacy-conscious users. You could require a Turing test for each new account, but all this means is that an attacker couldn't use a script to create their 1,000 accounts -- an attacker could still create the accounts if they had enough time, or if they paid some kid in India to create the accounts. You could give users voting power in proportion to some kind of "karma" that they had built up over time by using the site, but this gives new users little influence and little incentive to participate; it also does nothing to stop influential users from "selling out" their votes (either because they became disillusioned, or because they signed up with that as their intent from the beginning!).
So, any algorithm designed to protect the integrity of the Search Wikia results would have to deal with this type of attack. In a recent article about Citizendium, a proposed Wikipedia alternative, I argued that you could deal with conventional wiki vandalism by having identity-verified experts sign off on the accuracy of an article at different stages. That's practical for a subject like biology, where you could have a group of experts whose collective knowledge covers the subject at the depth expected in an encyclopedia, but probably not for a topic like "dedicated hosting" where the task is to sift through tens of thousands of potential matches and find the best ones to list first. You need a new algorithm to harness the power of the community. I don't know how many possible solutions there are, but here is one way in which it could be done.
Suppose a user submits a requested change to the search results -- the addition of their new Site A, or the proposal that Site A should be ranked higher. This decision could be reviewed by a small subset of registered users, selected at random from the entire user population. If a majority of the users rate the new site highly enough as a relevant result for a particular term, then the site gets a high ranking. If not, then the site is given a low ranking, possibly with feedback being sent to the submitter as to why the site was not rated highly. The key is that the users who vote on the site have to be selected at random from among all users, instead of letting users self-select to vote on a particular decision.
The nice property of this system is that an attacker can't manipulate the voting simply by having a large number of accounts at their control -- they would have to control a significant proportion of accounts across the entire user population, in order to ensure that when the voters were selected randomly from the user population, the attacker controlled enough of those accounts to influence the outcome. (If an attacker ever really did spend the resources to reach that threshold point, and it became apparent that they were manipulating the votes, those votes could be challenged and overridden by a vote of users whose identities were known to the system. This would allow the verified-identity users to be used as an appeal of last resort to block abuse by a very dedicated adversary, while not requiring most users to verify their identity. This is basically what Jimmy Wales does when he steps in and arbitrates a Wikipedia dispute, acting as his own "user whose identity is known".)
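To make the random-selection argument concrete, here is a minimal Python sketch. It is not anything the Wikia project has published; the function names, the jury size of 101, and the population of 100,000 are all assumptions chosen for illustration. It draws a review panel uniformly at random and computes the chance that an attacker's phantom accounts end up as a strict majority of that panel.

```python
import random
from math import comb


def select_jury(user_ids, jury_size, rng=random):
    """Pick reviewers uniformly at random, without replacement.

    The submitter has no say in who reviews the change, which is the
    property the argument above depends on.
    """
    return rng.sample(user_ids, jury_size)


def prob_attacker_majority(total_users, attacker_accounts, jury_size):
    """Chance that phantom accounts form a strict majority of a random jury.

    This is a hypergeometric tail: the jury is drawn without replacement
    from a population containing `attacker_accounts` phantom accounts.
    """
    honest = total_users - attacker_accounts
    needed = jury_size // 2 + 1
    favourable = sum(
        comb(attacker_accounts, k) * comb(honest, jury_size - k)
        for k in range(needed, jury_size + 1)
    )
    return favourable / comb(total_users, jury_size)


if __name__ == "__main__":
    users = list(range(100_000))  # hypothetical user population
    print("sample jury:", select_jury(users, 5))
    for phantom in (1_000, 30_000, 60_000):
        p = prob_attacker_majority(100_000, phantom, 101)
        print(f"{phantom} phantom accounts -> P(majority) = {p:.3g}")
```

Under these assumed numbers, an attacker holding a small share of all accounts almost never captures a panel, while one holding more than half of all accounts usually does, which is the threshold effect described above.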
This algorithm for an "automated meritocracy" (automeritocracy? still not very catchy at 7 syllables) could be extended to other types of user-built content sites as well. Musicians could submit songs to a peer review site, and the songs would be pushed out to a random subset of users interested in that genre, who would then vote on the songs. (If most users were too apathetic to vote, the site could tabulate the number of people who heard the song and then proceeded to buy or download it, and count those as "votes" in favor.) If the votes for the song are high enough, it gets pushed out to all users interested in that genre; if not, then the song doesn't make it past the first stage. If there are 100,000 users subscribed to a particular genre, but it only takes ratings from 100 users to determine whether or not a song is worth pushing out to everybody, that means that when "good" content is sent out to all 100,000 people but "bad" content only wastes the time of 100 users, the average user gets 1,000 pieces of "good" content for every 1 piece of "bad" content. New musicians wouldn't have to spend any time networking, promoting, recruiting friends to vote for them -- all of which have nothing to do with making the music better, and which fall into the category of deadweight losses described above.
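The 1,000-to-1 figure rests on an assumption the paragraph leaves unstated: that "good" and "bad" songs are submitted in roughly equal numbers. The short Python sketch below makes that explicit; the function name and parameters are hypothetical, and the model simply counts how many listeners each kind of song reaches.

```python
def good_to_bad_ratio(subscribers, sample_size, good_items=1, bad_items=1):
    """Ratio of good to bad items heard by the average subscriber.

    Assumes every good item reaches all subscribers, while a bad item is
    heard only by the `sample_size` randomly chosen listeners who screen it.
    """
    good_exposures = good_items * subscribers
    bad_exposures = bad_items * sample_size
    return good_exposures / bad_exposures


if __name__ == "__main__":
    # The essay's numbers: 100,000 subscribers, 100 screeners per song.
    print(good_to_bad_ratio(100_000, 100))
    # If bad submissions outnumber good ones 9 to 1, the ratio shrinks.
    print(good_to_bad_ratio(100_000, 100, good_items=1, bad_items=9))
```

If bad submissions far outnumber good ones, the ratio falls in proportion, although it stays favorable as long as the screening sample is small relative to the audience.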
An automeritocracy-like system could even be used as a spam filter for a large e-mail site. Suppose you want to send your newsletter to 100,000 Hotmail users (who really have signed up to receive it). Hotmail could allow your IP to send mail to 100,000 users the first time, and then if they receive too many spam complaints, block your future mailings as junk mail. But if that's their practice, there's nothing to stop you from moving to a new, unblocked IP and repeating the process from there. So instead, suppose that Hotmail stores your 100,000 received messages temporarily into users' "Junk Mail" folders, but selectively releases a randomly selected subset of 100 messages into users' inboxes. Suppose for argument's sake that when a message is spam, 20% of users click the "This is spam" button, but if not, then only 1% of users click it. Out of the 100 users who see the message, if the number who click "This is spam" looks close to 1%, then since those 100 users were selected as a representative sample of the whole population, Hotmail concludes that the rest of the 100,000 messages are not spam, and moves them retroactively to users' inboxes. If the percentage of those 100 users who click "This is spam" is closer to 20%, then the rest of the 100,000 messages stay in Junk Mail. A spammer could only rig this system if they controlled a significant proportion of the 100,000 addresses on their list -- not impossible, but difficult, since you have to pass a Turing test to create each new Hotmail account.
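One way to read "looks close to 1%" versus "closer to 20%" is as a comparison of likelihoods. The Python sketch below is only an illustration of that reading, not a description of how Hotmail or any real filter works: it assumes each of the 100 sampled users clicks "This is spam" independently, uses the 1% and 20% rates from the example, and the function names are made up.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k complaints from n independent users."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def looks_like_spam(complaints, sample_size=100, p_ham=0.01, p_spam=0.20):
    """True if the count is more likely under the spam rate than the
    legitimate-mail rate (a simple maximum-likelihood decision)."""
    return binom_pmf(complaints, sample_size, p_spam) > binom_pmf(
        complaints, sample_size, p_ham
    )


def error_rates(sample_size=100, p_ham=0.01, p_spam=0.20):
    """Return (false_alarm, missed_spam) probabilities for the rule above."""
    counts = range(sample_size + 1)
    false_alarm = sum(
        binom_pmf(k, sample_size, p_ham)
        for k in counts
        if looks_like_spam(k, sample_size, p_ham, p_spam)
    )
    missed_spam = sum(
        binom_pmf(k, sample_size, p_spam)
        for k in counts
        if not looks_like_spam(k, sample_size, p_ham, p_spam)
    )
    return false_alarm, missed_spam


if __name__ == "__main__":
    for complaints in (1, 6, 7, 20):
        print(complaints, "complaints -> spam?", looks_like_spam(complaints))
    print("false alarm, missed spam:", error_rates())
```

With these assumed rates the two cases are far enough apart that a sample of 100 rarely misclassifies a mailing in either direction, which is why such a small sample can stand in for the full list.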
The problem is, there's a huge difference between systems that implement this algorithm, and systems that implement something that looks superficially like this algorithm but actually isn't. Specifically, any site like HotOrNot, Digg, or Gather that lets users decide what to vote on, is vulnerable to the attack of using friends or phantom users to vote yourself up (or to vote someone else down). In a recent thread on Gather about a new contest that relied on peer ratings, many users lamented the fact that it was essentially rigged in favor of people with lots of friends who could give them a high score (or that ratings could be offset unfairly in the other direction by "revenge raters" giving you a 1 as payback for some low rating you gave them). I assume that the reason such sites were designed that way is that it just seemed natural that if your site is driven by user ratings, and if people can see a specific piece of content by visiting a URL, they should have the option on that page to vote on that content. But this unfortunately makes the system vulnerable to the phantom-users attack.
(Spam filters on sites like Hotmail also probably have the same problem. We don't know for sure what happens when the user clicks "This is spam" on a piece of mail, but it's likely that if a high enough percentage of users click "This is spam" for mail coming from a particular IP address, then future mails from that IP are blocked as spam. This means you could get your arch-rival Joe's newsletter blacklisted, by creating multiple accounts, signing them up for Joe's newsletter, and clicking "This is spam" when his newsletters come in. This is an example of the same basic flaw -- letting users choose what they want to vote on.)
So if the Wikia search site uses something like this "automeritocracy" algorithm to guard the integrity of its results, it's imperative not to use an algorithm vulnerable to the hordes-of-phantom-users attack. Some variation of selecting random voters from a large population of users would be one way to handle that.
Finally, there is a reason why it's important to pay attention to getting the algorithm right, rather than hoping that the best algorithm will just naturally "emerge" from the "marketplace of ideas" that results from different wiki-driven search sites competing with each other. The problem is that competition between such sites is itself highly inefficient -- a given user may take a long time to discover which site provides better search results on average, and in any case, it may be that Wiki-Search Site "B" has a better design but Wiki-Search Site "A" had first-mover advantage and got a larger number of registered users. When I wrote earlier about why I thought the Citizendium model was better than Wikipedia, several users pointed out that it may be a moot point, for two main reasons. First, most users will not switch to a better alternative if it never occurs to them. Second, for sites that are powered by a user community, it's very hard for a new competitor to gain ground, even with a superior design, if the success of your community depends on lots of people starting to use it all at once. You could write a better eBay or a better Match.com, but who would use it? Your target market will go to the others because that's where everybody else is. Citizendium is, I think, a special case, since they can fork articles that started life on Wikipedia, so Wikipedia doesn't have as huge of an advantage over them as they would if Citizendium had to start from scratch. But the general rule about imperfect competition still applies.
It's a chicken-and-egg problem: You can have Site A that works as a pure meritocracy, and Site B that works as an almost-meritocracy but can be gamed with some effort. But Site B may still win because the larger environment in which they compete with each other, is not itself a meritocracy. So we just have to cross our fingers and hope that Search Wikia gets it right, because if they don't, there's no guarantee that a better alternative will rise to take its place. But if they get it right, I can hardly wait to see what changes it would bring about.
-
A Wikipedia Without Graffiti
Frequent Slashdot Contributor Bennett Haselton writes "I'm a Wikipedia junkie. There's nothing more fun than switching back and forth between reading about the history of human evolution, and following the latest speculation about the identity of the mysterious R.A.B. in the Harry Potter books, and Wikipedia is the best site to find it all in one place. But as a fan, it's always been frustrating for me knowing that Wikipedia could never improve beyond a certain point -- as it becomes more popular, it becomes more tempting to vandalize, and in turn becomes less reliable, a point that many have made already. That's why I'm excited that sites like Citizendium are approaching the same problem with a different model, one that could enable them to become what Wikipedia almost was, but which its intrinsic nature kept it from being: a central, reliable source of freely redistributable information about almost anything. The main difference is that Citizendium articles, after initially being built up through the same collaborative process that Wikipedia uses, will go into an editor-approved stage, at which point an editor (publicly identifiable on the article's history page) signs off on the accuracy of the article, and further edits also have to be approved by an editor."
Editor control over articles is controversial within the "radical collaboration" community; the Wikimedia Foundation lists five "foundation issues that are essentially beyond debate", which include "Ability of anyone to edit articles without registering". (In practice there are some safeguards in place to protect articles that are frequent targets of vandalism, like the George W. Bush entry.) But I'm fanatically results-oriented in my thinking, and I always ask: What are the purposes of this project, and how does this feature help achieve those purposes? It seems to me that a free online encyclopedia fills four main needs:
- A source of information about pop culture that can be fun to read even without being 100% sure that it's accurate (like who R.A.B. is)
- A source of information that can be freely and legally redistributed, e.g. by printing out copies for a class to read
- A source of information on subjects where you need to be close to 100% certain that the information is reliable -- at least as certain, say, as you would be if you read the same fact in several books
- A source of information that you can cite in a school paper as being reasonably authoritative and reliable
For the reliability problem, I can't improve on this priceless sentence from Wikipedia's own "Citing Wikipedia" page:
For many purposes, but particularly in academia, Wikipedia may not be considered an acceptable source. [ citation needed ]
Wikipedia has actually done much better than I would have expected -- a study done in 2005 found that Wikipedia averaged about 4 errors per article compared to Britannica's 3, which is pretty good for a site where anybody can write that Columbus sailed to the New World in ships named the Ninja, the Pinto, and the Santa Fe. But for a site that harnesses the efforts of volunteers all over the world, I think the goal should be to surpass what has been done before, not just to tie with Britannica. And even if Wikipedia's error rate someday beats Britannica's, under its current model Wikipedia can never have the key property that Britannica has, which is that you can cite it as an authoritative source without sounding silly.
Citizendium's model of editor-approved articles, and editor approval of further edits to those articles, can help to achieve the benefits of collaboration, harnessing the efforts of volunteers, without falling into Wikipedia's traps. Assuming you can verify an editor's credentials (and we'll get to this in a minute), having an editor manage an article means two things: (a) you know the page wasn't vandalized in the last five minutes, and (b) you ought to be able to cite the work as a reference in a paper if your teacher isn't a total Luddite and you can explain to them how Citizendium works. Meanwhile, volunteers can still contribute without their own credentials being checked out; they can write as much as they want for an editor-approved article, as long as it's approved by the editor before going live.
There are still loopholes, of course. Currently Citizendium asks people to edit under their real name, but says that "we will use the honor principle to begin with", so anyone could claim to be a professor or a lunar astronaut. But the key words are "to begin with"; the difference between Wikipedia and Citizendium is that Citizendium views this as a loophole and not an intrinsic "community value", and loopholes can be fixed. To make the reliability as airtight as possible, I hope that Citizendium will eventually implement some sort of verification system, such as checking a professor's contact information on a Web page in the "faculty" section of an .edu Web server. I'm not instinctively thrilled by the thought of checking out volunteers' contact information, but it seems like the only way to achieve goals #3 and #4 above, so if it's as simple as sending a verification e-mail to an .edu address, that's a lot of gain for little effort. (Remember, this only has to be done for editors who sign off on articles, not for all volunteers. A non-editor volunteer could still ask to have their credentials checked out, so that they can be cited by their real name in the "end credits" of an article that lists volunteer contributors. But impersonation among regular volunteers is not likely to be a problem, since the editorial approval process ensures that only value-adding edits will be allowed, and it's unlikely that Alice would pretend to be Bob so that Bob can take all the glory of Alice's contributions to the project!)
Besides verifying authors' credentials, the one change that I hope Citizendium considers in the future is to give authors and editors credit at the top of each article -- or, for articles with many contributors, perhaps editors would be listed at the top and the "end credits" would list all contributors, on a separate page if necessary. This is because credited authorship for an article can help improve the article's usefulness in two ways -- the article can be cited as a reliable source, and the "name up in lights" factor rewards people for contributing more and better articles. Having authors listed only on the history page of an article, as they are in the current model, achieves the credibility benefit but not the "name up in lights" benefit. Larry Sanger suggested that having authors listed at the top of each article might put off readers from submitting edits -- if an article is perceived as being "owned", then others might feel like it's rude for them to change it. For me personally, this could go either way -- on the one hand, I might not realize that I was welcome to edit an article, but on the other hand, I think I might be more inclined to submit edits if I knew there was an editor in charge to keep someone else from frivolously overwriting my edits later. But in any case, to address this problem, each article could carry a banner at the top saying "Readers are encouraged to submit edits and other suggestions", and each paragraph could be accompanied by an "Edit" link, similar to Wikipedia (except that edits would go into a queue to be reviewed by the editor instead of going live). This would address the ownership-intimidation problem without taking away from the "name up in lights" factor. Sanger says that the Digital Universe Encyclopedia -- comprising the Encyclopedia of Earth and an Encyclopedia of the Cosmos, under development -- has plans to join with Citizendium and will use the credited-author model on their version of the site.
You might say that editors having their "name up in lights" would be an ego thing for editors, and I think you'd be right -- but I don't think this would be a bad thing, inasmuch as ego would motivate more people to become editors and do their best work. Perhaps I'd be wrong about this. Maybe a limited experiment could be carried out with two sites that are similar in every respect except that one allows editors and authors to take credit for their work, as might turn out to be the case with Citizendium and Encyclopedia of Earth. The point is that I don't think such a suggestion should be judged by whether it goes against the "spirit" of the project (as it certainly does in the case of Wikipedia!), but rather whether it helps to achieve the project's goals, such as goals #1 through #4 listed above.
There are still some problems that Citizendium's differences from Wikipedia won't solve. Many schools discourage citing Wikipedia not because it's written anonymously or because it contains errors, but because it's an encyclopedia. Yale's guidelines for citing Wikipedia state:
As an encyclopedia, Wikipedia is written for a common readership. But students in Yale courses are already consulting primary materials and learning from experts in the discipline. In this context, to rely on Wikipedia -- even when the material is accurate -- is to position your work as inexpert and immature.
Presumably many academics would have the same objections to a student citing Citizendium. I understand what these teachers mean, but I think this is a case of not thinking in terms of results. If the purpose of an assignment is to collect and present information, then any means of accomplishing that goal should be valid, including the easiest method of looking up the information in an encyclopedia. To make a student look beyond the encyclopedia, an assignment can simply require depth of research that goes beyond what the encyclopedia would provide. (Students, if you're worried that your teacher will take this to heart and make your assignments harder, just be happy that your teacher is hip enough to be reading this in the first place.) Some things are hard, but they should only be hard if they're intrinsically hard, not because you handicapped yourself with arbitrary rules.
But there is another, more permanent problem -- even with verification of authors' credentials, how do we know that the information in Citizendium articles is accurate? How do we know the author didn't make a mistake, or lie? This gets into deeper issues because these problems exist no matter what source you're consulting. There are books in print that deny the Holocaust or the possibility of evolution, and they're printed on real paper, with ISBN numbers and everything. Some of them even make it into libraries. How skeptical should we be of what we read in books? In January two advocacy groups presented a report to Congress in which many government scientists said they felt pressured by the Bush administration to downplay the global warming threat in their statements. Does that mean statements from government scientists are inherently suspect?
And almost anyone who has had more than two articles written about them, knows the feeling of reading the article and reacting, "Wow, I had no idea that I was a transgendered NRA member who volunteers with the Moonies!" The New York Times is hosting an article about me from 2000 claiming that I was fired from Microsoft, when I actually quit. I showed them a copy of my personnel file with "Voluntary resignation" printed on it, but they have still refused to change the article. (When I first wrote to the paper's "Public Editor" about the matter, created to restore "reader credibility" after the Jayson Blair scandal, they replied that they wouldn't change the error because it never appeared in the print version of the paper. Huh?) I put up my own webpage to tell my side of the story, but if you were a Wikipedia or Citizendium editor and you had conflicting information from different sources, who would you believe, the New York Times, or a Web site called PublicEditorMyAss.com?
And yet, I freely admit that even today, I would trust a fact from the New York Times more than a fact from Bob's Bait And Tackle Shop And Technology Blog. We instinctively trust sources because of their reputation; we figure that they must have gotten their reputation somehow. This is not a great algorithm for deciding trustworthiness, but it may be the best that we can do -- in a world where we can't verify every fact firsthand, what choice do we have but to rely on sources that have provided mostly-reliable information in the past? (Wikipedia vandals are able to hack this mental algorithm because we think of Wikipedia as "one source" with a high average reliability, when it's really comprised of many sources, some of whom are deliberately less reliable than others.)
So, I think the Citizendium model is a move in the right direction -- taking into account the limits of what we can know from third-party sources, and doing the best we can within those limits. The least we can do is to know who has signed off on the accuracy of an article, so we can factor that into our decision to trust it. Last month Citizendium released their first editor-approved article, a single article about Biology. It may not look like anything revolutionary right now, but the difference between that and the Wikipedia entry is that you can't change the title of the Citizendium article to LARRY SANGER IS A BUTT BRAIN HA HA. You have to go through an editor for that.
-
Why You & Yahoo Should Like This Human Rights Law
Regular contributor Bennett Haselton has written in to say that "The Global Online Freedom Act, introduced last year during a firestorm of controversy over American companies cooperating with totalitarian governments in China and elsewhere, was introduced this month as the Global Online Freedom Act of 2007. When Chris Smith (R-NJ) first introduced the law in 2006, Yahoo was under fire for recently turning over information to Chinese authorities that led to the arrest of a political dissident, Microsoft was attacked for removing pages from MSN Spaces China at the behest of the government, Google was being criticized for removing political sites from search results displayed to China, and Cisco was accused of helping to enable Chinese filtering of the Web. All four corporations testified at a February 2006 House hearing during which Representative Tom Lantos summed up the mood of many of his colleagues by telling the companies, "I do not understand how your corporate leadership sleeps at night." The companies protested that they had no choice but to comply with local Chinese laws, but that they were troubled by their own actions, and -- in a rarity for individual tech companies, much less for a chorus -- they all invited the U.S. government to play a bigger role, while being vague about what the role should be."
GOFA would create a U.S.-government-designated list of "Internet restricting countries" and would in most cases prohibit U.S.-based companies from censoring content or turning over users' information to the governments of those countries. Do these companies want GOFA to pass? And is GOFA a good law? I think, yes and yes, but the answers are more complicated than they seem.
With American "collaboration" less in the news, GOFA made less of a splash when it was re-introduced this year, but it is still the subject of spirited debate. Reporters Without Borders, Amnesty International, and other human rights groups have already signed a statement supporting the July 2006 version of the bill (nearly identical to the 2007 version). But blogger-journalist Rebecca MacKinnon argues that by creating a government-maintained list of "Internet censoring countries", the law falls short of calling for support of free speech in all countries (the initial list, for example, includes Iran and China, but leaves out notorious human rights violator and net-censor Saudi Arabia). Danny O'Brien of the EFF backs this position as well, and also argues the organization's long-standing position that "code is speech" and that filtering software should not be subject to export regulations that are proposed in the law.
I agree with MacKinnon that instead of using a list of "Internet restricting countries", we should require the same standards of U.S. companies wherever they do business, or at least, stop playing silly games like leaving Saudi Arabia off of a list of human rights violators because Bush is friends with the ruling family. I agree with the EFF that filtering software should be considered First-Amendment-protected speech like encryption software, and not be included on an export-prohibited "munitions" list. And for reasons listed below, I think that the law won't stop censoring countries from blocking any speech they want. But even with all of these qualifications, I think the law would be a step in the right direction, if only for the rules prohibiting companies from turning over users' personal information to the governments of countries like China and Iran. It's painful to give a pass to countries like Germany that also censor political speech, but I think that the situation is so much worse in places like China that we should do what we can in the short term. And for reasons I'll get into, I think that Microsoft, Yahoo, Google and Cisco are secretly hoping that a law like GOFA does get passed -- even if they can't come out and say so.
First, what the law does not do: There is still nothing to stop a U.S. company from blocking or removing legal, political content at the request of a foreign government. Section 204 says only that American content-hosting companies and content-filtering companies have to provide the U.S. government with a list of sites that have been removed or blocked at the behest of a censoring country.
Section 205 does say that U.S. companies may not block or remove sites that are operated by the U.S. government, or by any entity that receives grants from the International Broadcasting Bureau to help defeat foreign censorship. Presumably that would include Peacefire, at least during the periods when we're under contract to the IBB to develop the Circumventor software (but before you start calling me Hallibennett, I'm not working for the IBB right now, and it was my own idea to write this). So the American government, while requiring schools to block us in the U.S., would actually be helping to get us un-blocked in China and Iran! But Section 205 only says that a U.S. business may not block or shut down such sites. As far as I can tell, that means if the Cisco engineer on site in China sets up their routers for them, the Cisco engineer can't put VOANews.com on the block list. But then the Chinese official can walk across the room and add it to the list himself, can't he? Which is almost certainly what they'll do, since the routers are in their country.
So, I think the regulations against Internet blocking will be easy for foreign governments to ignore. But where the law could make a difference is in the prohibition against turning over users' personal data to law enforcement in censoring countries. Section 201 says that servers located in a censoring country cannot contain personally identifiable user information (so that the local police cannot simply storm in and seize the data). Section 202 says that American companies can only turn information over to law enforcement of a censoring country if the information is needed "for legitimate foreign law enforcement purposes as determined by the Department of Justice". MacKinnon has criticized this aspect of the law as well -- "If Americans don't want the DOJ to have access to their user information, why should anybody else?" Very true. But, even at the lowest point of public confidence in the Department of Justice, I think most people living outside of fortified compounds stocked with beef jerky and gold bullion, can agree that the U.S. DoJ has more integrity and legitimacy than the government of China, and that such a rule would mean fewer Chinese dissidents going to jail.
What do the affected U.S. companies think of the law? Microsoft, Yahoo, and Cisco did not respond to requests for comment. A Google PR person replied to say, "We welcome initiatives that expand access to information and protect the rights of users across the globe. At the same time, we remain concerned that legislation in this area can have unintended consequences, so we intend to study any such proposals closely, and work with proponents and others to reach the right outcome." When I replied that the Global Online Freedom Act had been proposed more than a year ago and had been online in its current form since June 2006, presumably enough time to "study such a proposal closely" and take a position on it, he said they would stick with that statement for now. (In his e-mail, he actually put quote marks around the company's statement, which I thought was a nice dry touch.)
But past statements from the respective companies have indicated they would be amenable to such a law. Bill Gates, never one to be shy about criticizing government regulation that he disagreed with, was asked in a February 2006 interview with the London Times, "Should the US government establish guidelines to regulate how internet companies deal with censorship in countries like China?" and answered, "I think something like the Foreign Corrupt Practices Act has been a resounding success in terms of very clearly outlining what companies can't do and other rich countries largely went along with that." At the February 2006 House hearings to discuss American companies' cooperation with overseas censors, representatives from all companies indicated that they actually wanted the government to play a bigger role -- they were vague about what such a role would be, but this was only a month after the first draft of the Global Online Freedom Act had been proposed, the only such law on the table at the time.
At first this might seem paradoxical -- why would companies seem amenable to, even supportive of, laws that would restrict what they can do? But it actually makes sense if you consider their negotiating position with the Chinese government. Currently, the Chinese censors can tell Microsoft, Yahoo, and Google that they have to either play by the Chinese rules or get out, and the censors know that the companies will comply (without even necessarily feeling guilty about it -- the companies can always say that the Chinese people are better off with a censored version of their services than no access at all).
But if the companies' hands are tied by U.S. law, then they can basically present the Chinese government with a take-it-or-leave-it deal: You can use our e-mail and messenger and blog services, just know that our government won't let us turn over users' personal information if you ever want it. The Chinese censors are presumably coming from the point of view that they'd rather have a controlled Internet, but that it's more important to reap the economic benefits of having the Internet in their country, even if some control is lost (after all, if they didn't believe that, they wouldn't have connected to the Internet in the first place). Hence it's not likely that they'd throw out Yahoo Mail and Google search and MSN Messenger when so many users depend on these and use them for business as well as personal use. (Even if there are Chinese-made alternatives, there would be the huge cost of switching everyone over, and no longer being able to use the old tools to communicate with American companies.) So a law controlling the actions of U.S. companies would very probably allow them to keep doing business in censored countries, while giving them an excuse not to turn over users' data.
But, that might not work if it looks like the companies pushed too hard for the law themselves. If the Chinese see Yahoo fighting tooth and nail to pass a law that restricts what information Yahoo can hand over to China, the Chinese censors could take that as a slap in the face, and punish Yahoo for defying them even after the law is passed that prohibits Yahoo from cooperating. "Oh, you can't give us that information because of the law? This law right here that you lobbied for?"
So, when the general counsel of Yahoo says, "Ultimately, the greatest leverage lies with the U.S. government"; when the Vice President of Google tells Congress, "And certainly also, finally, there is a role for government. We do need your help, and you can help us"; when the associate general counsel of Microsoft testifies, "It is, therefore, the responsibility of governments, with the active leadership of the United States, to seek to reduce or reconcile these differences", I think what we're hearing are subtly encoded messages saying, "Pass this law, or something like it; we just can't look like we wanted it to pass." So, Congress should give them what they want, even if they can't ask for it directly. And at the same time they would be helping users in censored countries all around the world, before the next one gets sent to jail because an American company turned over their information.
-
Are DMCA Abuses a Temporary or Permanent Problem?
Regular Slashdot contributor Bennett Haselton wrote in with a story about the DMCA. He starts "On January 16, a man named Guntram Graef, who invoked the Digital Millennium Copyright Act to ask YouTube to remove a video of giant penises attacking his wife's avatar/character in the virtual community "Second Life", retracted the claim and stated that he now believes the video was not a copyright violation. (He had sent similar notices to BoingBoing and the Sydney Morning Herald just for posting screen shots of the video.) His statements in a C-Net interview suggest that he didn't mean to alienate the anti-censorship community and was probably angry over what he saw as a sexually explicit attack on his wife. But the event sparked renewed debate over the DMCA and what constitutes abuse of it. I sympathize with Graef and I admire him for admitting an error, but I still think the incident shows why the DMCA is a bad law." Hit that link below to read the rest of his story.
The DMCA is known mainly for its two most controversial provisions: the ban on technology to circumvent copyright restrictions, and the procedures by which ISPs must respond to "take down" notices if a third party claims that one of the ISP's users is violating their copyright. The first of these, I am opposed to in principle; the second, I am not opposed to in principle but I think is too easy to abuse in practice -- because I think incidents like the Graef case and my own limited court experience in related areas have suggested that the protections against DMCA-type abuses are very weak.
First, I'm against the anti-circumvention provision in principle because I agree with the position espoused by the EFF that computer code is protected under the First Amendment, even if some uses of that computer code may be illegal. After all, at one point a U.S. court even ruled that a manual for carrying out murders as a hit man was protected speech! That ruling was overturned on appeal, and the case was settled out of court before a final decision was ever reached, but still -- given that a handbook for killing people was considered free speech by at least one court, it's a bit of a stretch to think that a DVD-copying program should be given less protection. Just because X is illegal does not mean that tools or instructions for doing X should also be illegal.
With regard to the second provision, I'm not against requiring ISPs to take down infringing material on receipt of a notice from the copyright holder. But in practice there are two avenues for abuse here: (a) the party sending the take down notice can make statements that are not technically false, but which have the effect of persuading the ISP to take the material down, or (b) the party sending the take down notice can simply lie -- because the truth is that in too many cases, false statements made "under penalty of perjury" are not prosecuted, or even noticed, by the courts.
The EFF has already done a good job documenting abuses under the DMCA, and I'm not going to repeat all of that here. My argument is that these are not just temporary problems with a relatively new law, but rather that the abuses are the result of realities that won't change any time soon: ISPs being too busy to look closely at every complaint, and courts being too busy to go after everyone who violates court rules to get what they want. And thus it does no good to say that the DMCA would be fine if only enforcement actually got done properly instead of the ham-handed way it's been carried out so far, because that's not going to happen.
As I said, I think that if you have a bona fide case against a party, there's nothing wrong with taking action against them that would otherwise be considered a violation of their privacy and other rights. I've never sent a DMCA take down notice myself, but I've been involved in court cases in which I asked the judge to sign an order requiring a third party to turn over information about someone that was pertinent to the case. I don't consider that an abuse of the system, if the information you're after is relevant.
I realize this may separate me from some fellow privacy advocates, and some of the things I've done may make them uncomfortable. In one case, I had invited a girl to a charity luncheon where the tickets were $100 apiece, and when she showed up she had "forgotten her checkbook" and needed to borrow the money... Now, don't get ahead of me... Later, in what will not come as a huge spoiler to my fellow male Seattle residents, she apparently decided that, being a non-overweight, non-single-Mom, non-sexually-repressed girl in a city full of rich single guys, she was under no obligation to pay me back, and said, "Go ahead and sue me". Anyone who knows about my sideline taking spammers to court would tell you, it is not a terrifically smart move to say to me, "Go ahead and sue me". So, since I was going to be at the courthouse for an upcoming case against a spammer, I figured, why not, and filled out a Small Claims form with the defendant's address listed as "to be determined", since all I had was her cell phone number. Then I asked the judge to sign an order asking T-Mobile to give me the rest of her information so I could serve the papers on her. The judge signed it, I mailed it off to T-Mobile, and three weeks later T-Mobile sent me a letter containing her address, where I had the papers served. Most people don't know it's possible to do this just in a case where someone owes you $100 and all you have is a phone number, but that's just because a lawyer would never bother with such a small case, and most non-lawyers don't know the option exists -- and of course, it also depends on the judge, who may or may not sign the order.
(In that vein, people always ask me, is that sort of thing really worth the time? In this case, since I was going to be at the courthouse anyway, the extra time to write the motion, get it signed, and mail it off, was less than 30 minutes. But I was mainly curious about whether or not it could be done, and how much privacy protection there really is under the law, and knowing that was worth more to me than the $100 anyway.)
So I don't think it's unethical to request such information if you have a genuine case against a party. But while I don't think that what I did constitutes abuse of the system, I think it clearly shows how the system could be abused. Nobody checked my ID when I filed the case or asked the judge to sign the subpoena; I could have been anybody, and I could have disappeared once I had the information. (I had T-Mobile mail it to my address, but I could have just as easily had them mail it to the court, and then gone down and asked to look at the court file.) DMCA opponents should be aware that even without the DMCA, privacy protections are not as great as most people probably think they are.
As a result, I'm especially nervous about laws that enable abuse based on copyright assertions, because almost all of the legal threats we've ever received at Peacefire were based on what I considered to be bogus "copyright" claims. In 1997 we published a program that you could run on any computer with CYBERsitter blocking software installed, and it would decrypt the file that stored CYBERsitter's "secret" blocked-site list, and print it out in plain text. The CEO of CYBERsitter claimed that we were "violating every intellectual property law ever written" and sent threatening notices to our ISP demanding that they remove the program. I argued that every byte of the decryption program was our original work, so it didn't violate their copyright. In fact, it didn't even enable violations of their copyright, because it didn't make it any easier for someone to distribute illegal copies of their program, and I also said the decryption program served a worthwhile purpose by allowing customers or potential customers to see what the program really blocked. (Although to me, the enabling issue and the "worthwhile purpose" issue were secondary to the primary point, that original works of computer code should be protected by the First Amendment.) Fortunately our ISP stood their ground, but if the DMCA had existed back then, CYBERsitter could have invoked it, and possibly the extra pressure might have caused our ISP to back down. (Blocked-site-decryption programs were originally exempt from the DMCA as a result of the decision of the Copyright Office, but that exemption was revoked in 2006 because nobody had written a new decryption program in three years.)
So that was an example of how a company could intimidate an ISP into taking down material, without technically lying about the situation, but tacking on the words "copyright violation" and hoping the ISP would capitulate. What about cases where the sender of a DMCA take down notice just lies?
The Dutch activist group Bits Of Freedom conducted an experiment in 2004, in which they signed up with 10 different ISPs and posted a copy of a work that was clearly labeled with a notice that the author had died 100 years ago and the copyright had expired. Then they sent fake "complaints" to all 10 ISPs from an anonymous Hotmail address. 7 of the 10 ISPs removed the content immediately, and one even replied to give the personal details of the account holder, without being asked to do so. So completely fictitious complaints do apparently work. The DMCA does provide more protection than that, because it requires the complainer to make a copyright claim "under penalty of perjury". But how much assurance does that really provide?
No one has yet tried to get our site shut down with a copyright claim or other accusation that was simply made up out of whole cloth. But my experiences in other areas have left me without much confidence in statements that are made "under penalty of perjury". The times I've been to court against spammers, I usually get to watch a few other Small Claims cases being tried. Probably at least once every time that I've been there, it's come to light that some party in a case said something that they almost certainly knew was not true, and I've never seen a judge do anything about it -- and court employees who have been there much longer have said they've never seen it happen either. (Judges are far more likely to get upset about people speaking out of turn. It's OK to lie, as long as you do it while the judge isn't talking!) It's true that Small Claims court is for resolving small matters, but lying under oath in Small Claims court is still a felony, punishable at least in theory by up to 10 years in jail. (And in any case, lawyers have told me that even in higher-level courtrooms, most false statements don't get anyone in big trouble. High-profile cases like Martha Stewart are the exception.) I don't think that everyone who lies under oath should go to the big house for 10 years. But I have no faith in the DMCA just because it requires accusatory statements to be made "under penalty of perjury", when judges usually let false statements under oath go completely unnoticed.
I doubt that a lawyer would risk their career and even their freedom to make up a completely fraudulent DMCA claim against us, such as claiming a page on our site was a ripoff of something originally produced by their client. But I don't think it's out of the realm of possibility that a lawyer would claim that, for example, a parody of one of their logos that appeared on our site was a "copyright violation" -- even though the company would almost certainly be advised by their lawyer that such parodies are protected speech, which means their statement would constitute perjury, but it would probably never be punished.
The low point of my own confidence in the enforcement of anti-perjury laws came when I sued a spammer who appeared in court and claimed that he had absolutely no knowledge of the spam being sent, and had never accepted any orders for spamming of any kind, while the judge, who appeared to hate anti-spam cases even more than most judges did, kept haranguing me for suing a clearly "innocent" person. I then played a recording of a conversation that I had with the spammer over the phone, pretending to be an interested customer (with a disclaimer played at the beginning of the call saying that it could be recorded, in order to make the taping legal), in which he said, among other things:
"I mean, we have all their information to back up any email we send them. If we have their ISP information, we can prove that they've given it out, because you can't get someone's ISP unless they've given it to somebody." [sic -- he meant "get someone's e-mail address", although the statement is still wrong]
"Do you already have your creatives and everything? So I've just got to upload what you have and just blast it out?" [note: "creatives" are copies of ads that are sent out for you by advertisers and spammers]
"It's a United-States-based company but they pump everything through China and then it comes back to the United States."
The judge appeared very flustered at that point and started accusing me of "entrapment" (which was backwards -- I'd never heard of the spammer until he spammed me first, and then I called him afterwards, just to get evidence that he was in the spamming business in case he showed up in court and denied it). Since she claimed it was entrapment, I still lost and the spammer walked out home-free, without the judge ever even commenting on the questionable veracity of the statements he had made at the beginning. And that is all the protection that exists in the real world against people making false statements "under penalty of perjury".
The point is that when reading the wording of a proposed law, there's a temptation to think that the scenario described is exactly how the law will play out when it's enforced (see the "Alice, Bob and Charlie" scenario in the Wikipedia entry on the relevant section of the DMCA), and that anyone who deviates from the rules will be punished. But my narrow experience in court, in an area unrelated to the DMCA, taught me some things that several lawyers, with sad smiles, have confirmed to be true throughout the law: (a) judges will do what they want; (b) even if judges do sincerely want to follow the law, they're unlikely to agree on what it says; and (c) courts don't have the will or the time to chase down every person who violates the rules.
Don't judge a law by what it says will happen. Judge it by how it will play out if more than half of the steps in the process get screwed up. Guntram Graef apparently wasn't even trying to do anything dishonest when he got a video removed from YouTube on the basis of copyright claims that turned out not to be valid. Imagine how much abuse is possible when you're gaming the system on purpose.
-
Behind the Magic of Anti-Censorship Software
Regular Slashdot contributor Bennett Haselton writes in to say "The December 1st release of Psiphon has sparked renewed interest in the various software programs that can help circumvent Internet censorship in China, Iran, and other censored countries. (Some of this interest undoubtedly being motivated by the fact that many of these programs also work for getting around blocking software at work or school.) Have you ever wanted to understand the science behind these programs, the way that mathematicians and codebreakers understand the magic behind PGP? If you loved the mental workout of reading "Applied Cryptography", have you ever wanted a tutorial to do the same for Psiphon and Tor and other anti-censorship programs?" The rest of his editorial follows.
Well, here's a primer, but you might be disappointed. Like making the Statue of Liberty disappear, it doesn't sound very cool once you know how it's done; the truth is that most anti-censorship programs, including mine, only work because the censors are not trying very hard.
(Note that I am going to be talking about ways that certain anti-censorship programs can be defeated. I don't believe that this is giving much help to censors, because these are obvious weaknesses that would occur to anyone who knows how the programs work. For reasons I'll get into at the end, I don't think these weaknesses actually make much difference.)
Basically, all anti-censorship programs fall into two categories: those that require you to have a helper outside of the censored country, and those that don't.
Take Psiphon. To use Psiphon, someone in a non-censored country has to install it on their home computer, which turns their computer into a Web server with an interface similar to Anonymouse.org, where you type in the URL of the page you want to view and it fetches it for you. The difference, of course, is that Anonymouse.org is widely known and blocked by any self-respecting Internet filtering system, while your newly created Psiphon URL pointing to your home computer is not blocked anywhere, yet. So if you set up a Psiphon URL on your computer in the U.S. and e-mail it to your friend in China, your friend can use it to surf wherever they want. (Note that this also has the desirable property that the person in China doesn't have to install any software, so they can use the URL even from a cybercafe computer with restricted user permissions.) The hurdle, of course, is that the person in China has to have a contact outside the country to help them. This is not a huge barrier for many Chinese, but it still means the program doesn't have the instant gratification property of something that you turn on and it just works.
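The "type in a URL and the helper's computer fetches it" idea can be sketched in a few lines of Python. This is only an illustration of the general mechanism, not the code of Psiphon, Circumventor or CGIProxy: the `/fetch?url=...` path, the `extract_target` and `serve` names and the port are all assumptions made up for the example, and it leaves out the SSL, link rewriting and account features that the real programs provide.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from urllib.request import urlopen


def extract_target(path):
    """Return the URL requested via '/fetch?url=...', or None if unusable."""
    query = parse_qs(urlparse(path).query)
    target = query.get("url", [None])[0]
    # Only plain Web URLs are relayed; anything else is refused.
    if target and urlparse(target).scheme in ("http", "https"):
        return target
    return None


class FetchHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: fetch the page on the user's behalf and relay it."""

    def do_GET(self):
        target = extract_target(self.path)
        if target is None:
            self.send_error(400, "expected /fetch?url=http(s)://...")
            return
        try:
            with urlopen(target, timeout=10) as upstream:
                body = upstream.read()
                content_type = upstream.headers.get("Content-Type", "text/html")
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            self.send_error(502, "fetch failed: %s" % exc)
            return
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the helper on the computer outside the censored country."""
    HTTPServer(("", port), FetchHandler).serve_forever()
```

The person outside the censored country would call `serve()` and pass along the resulting address; the person inside only needs a browser, which is the "no software to install" property described above.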
Peacefire, by the way, had released the Circumventor program in 2003 which did essentially the same thing. (And the Circumventor was itself really just a wizard for installing a Web server with James Marshall's CGIProxy script, which deserves most of the credit, although the Circumventor did help bring it "to the masses", since most users don't have the ability to set up an SSL-enabled Web server themselves.) Psiphon made some improvements, namely:
- Ability to create password-protected accounts to restrict the URL to certain users.
- Smaller download (although it may not matter much since only broadband users would be installing it anyway).
- Ability to run on Linux. (Circumventor only works on Windows, although you can install CGIProxy on a Linux webserver if you know how.)
- A wizard to help users forward incoming connections on their router and enter exceptions in software firewalls to make the software work. (If they want to. No tweaking people's firewall settings without asking them!)
- On the other hand, Circumventor is slightly harder to block, due to strategies such as using a different SSL certificate for each install (Psiphon uses the same one each time).
And both programs fall victim to the same attacks, although as far as I know, none of these have been implemented in practice:
- Blocking sites whose SSL certificates do not match the site hostname (easier for a censoring proxy server like the ones used in the Middle East, than for an IP firewall like the Great Firewall of China).
- Blocking outgoing Web connections to residential IP address ranges like Comcast.
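To make the two attacks above concrete, here is a rough sketch of the checks a censor's filter could apply. Everything specific in it is an assumption for illustration: the address ranges are reserved documentation ranges rather than any real ISP's allocation, the function names are invented, and the certificate name check is far simpler than real TLS hostname validation.

```python
import ipaddress

# Hypothetical "residential" ranges a censor might blacklist. These are
# documentation-only ranges, not real ISP allocations.
RESIDENTIAL_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]


def in_residential_range(ip):
    """True if the destination IP falls inside any blacklisted range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RESIDENTIAL_RANGES)


def cert_matches_host(cert_names, hostname):
    """Simplified check: exact name match, or one leading wildcard label."""
    hostname = hostname.lower()
    for name in cert_names:
        name = name.lower()
        if name == hostname:
            return True
        if name.startswith("*.") and "." in hostname:
            if hostname.split(".", 1)[1] == name[2:]:
                return True
    return False


def censor_would_block(ip, cert_names, hostname):
    """Apply both rules: residential destination, or certificate mismatch."""
    return in_residential_range(ip) or not cert_matches_host(cert_names, hostname)
```

A home-hosted proxy reached by bare IP address with a self-generated certificate would trip both rules at once, which is why either attack alone would be enough.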
But basically, they're the same program -- so the difference in press coverage has been illustrative of how much context matters to reporters. Psiphon is the "politically correct" version -- they've played down the fact that it can be used to get around blocking software in schools and played up the fact that it can be used to beat the censors in China and Iran, and the press coverage has focused exclusively on that human rights aspect. The Circumventor was also written to help foreign victims of censorship, and articles have been written about its uses for that purpose, but I've also been unapologetically promoting its use to get around blocking software at home and in school, as part of an advocacy for greater civil rights for people under 18. (Also because the more installations there are in the U.S., the more it helps users abroad.) As a result, some of the TV news pieces about it have used such ominous music and lighting that they practically looked like recycled footage from "To Catch a Predator". Of course, Psiphon can be used for exactly the same thing. (I also emailed some of the reporters who recently wrote about Psiphon, to tell them about Circumventor; so far, I haven't heard back from any of them, but I doubt they're being politically correct this time, I think they're just not thrilled that C-Net scooped them by three years and seven months.)
So, Psiphon and Circumventor fall in the first category -- programs that only work if you've got a contact outside the censored country to help you. In the second category is Tor, which was originally written to provide mathematically secure anonymity, but had the nice property that it could be used to get around the Great Firewall of China as well. With your browser in China using Tor as a proxy, packets are routed to other Tor nodes outside the country, which connect you with any blocked Web site that you want to see. Best of all, you just install it on a machine in China, and presto, it works, no nagging your expat cousin in the U.S. to install something on their computer to help you. Dynamic Internet Technologies, run by Chinese dissident Bill Xia in North Carolina, runs another service that works "out of the box" -- you send an instant-message to one of the DIT screen names, and it replies with a list of currently running Web proxies. (Bill has asked me not to publicize the actual screen names that perform this service, because it's intended only for Chinese users. I think that's a case of "security through obscurity", but I respect his wishes.)
Unfortunately, all such "instant gratification" solutions have the same basic weakness, which by a simple argument can be extended even to hypothetical future programs in the same category. In the case of a program like Tor, the censor only has to install the software, look at what IP addresses the software connects to when it bootstraps itself, and add those IP addresses to the blacklist. Even if the software chooses at random from multiple IP addresses to bootstrap to, the censor can still obtain all of them by repeatedly re-installing the software (possibly wiping the machine each time so the software can't tell that it's been installed before). No matter how you slice it, if Alice the legitimate user and Bob the censor download the program on the same day, Bob can make the program not work for Alice if he updates the blacklist quickly enough. He doesn't even have to reverse-engineer the software, he just has to use a network sniffer to see where it connects to. (For DIT's proxy-by-instant-message system, the censor can instant-message the screen name repeatedly, from different accounts, until they've collected and blocked all the available proxies; this would be analogous to re-installing Tor repeatedly and seeing what IPs it connects to.)
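The re-install attack described above can be simulated in a few lines. This is a toy model under stated assumptions, not a description of how Tor actually bootstraps: it assumes each fresh install contacts a uniformly random handful of addresses from a fixed pool, and the pool size, addresses and function name are all invented for the example.

```python
import random


def enumerate_bootstrap_nodes(all_nodes, per_install, rng, max_installs=10000):
    """Count the fresh installs needed before the censor has seen every node."""
    seen = set()
    installs = 0
    while len(seen) < len(all_nodes) and installs < max_installs:
        # Assumption: each fresh install contacts a random subset of nodes,
        # and the censor's sniffer records every address it connects to.
        seen.update(rng.sample(all_nodes, per_install))
        installs += 1
    return installs, seen


rng = random.Random(0)
nodes = ["192.0.2.%d" % i for i in range(1, 51)]  # hypothetical pool of 50
installs, blacklist = enumerate_bootstrap_nodes(nodes, per_install=3, rng=rng)
print("installs needed:", installs, "addresses collected:", len(blacklist))
```

However the random choice is made, the loop terminates once every address has been seen, and nothing in it requires reverse-engineering the software; the censor's cost is only the time spent re-installing.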
Peacefire has produced another approach which is a simple, obvious idea, and it was quite by accident that we found out it slips through the cracks of the seemingly "unsolvable" problem with instant-gratification outlined above. Like the other solutions, it works only as long as the censors are fairly lazy, but they are, and it does. About 30,000 people have signed up through a form on our site to be notified each time we create a new Circumventor site and mail it out, every 3 or 4 days. Agents of the blocking companies have joined the list too, of course, but we mail different sites to different subsets of the list. Now, an attack analogous to the attacks listed in the previous paragraph, would be for the censors to join under many different accounts, and then block any site that gets mailed to any of those accounts. But the catch is that when an address joins the list, a new site doesn't get mailed to that address until some random time in the future. So the censor has to check all of the fake Hotmail accounts that they've created, over and over, if they want to block all of the new sites as soon as they're released. Hardly impossible, but the censor can no longer use the instantaneous approach of: (1) enter the system / join the list / install the software; (2) see where it connects to and block those points of access; (3) repeat. (If we instantly e-mailed a randomly selected site to each new signup, then this attack would work.) By going from instant gratification to almost-instant-gratification, you change one of the conditions for the theorem stated in the previous paragraph, so that it no longer holds true. Still, like Tor and the DIT system, it could be blocked with a moderate amount of effort.
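A toy simulation shows how the random delay and the subset mailing change the censor's position. The model below is a guess at the general shape of the scheme, not Peacefire's actual mailing code: it assumes subscribers are split into equal-odds groups, that each new site goes to one randomly chosen group, and that a new address becomes eligible only after a random number of days. The function name and every parameter are hypothetical.

```python
import random


def simulate(num_groups, censor_accounts, max_delay_days, days, rng, release_every=3):
    """Return (sites released, sites the censor learned about and could block)."""
    # Each fake account lands in a random group and only starts receiving
    # mailings after a random delay (the "almost-instant" condition).
    accounts = [
        (rng.randrange(num_groups), rng.randint(1, max_delay_days))
        for _ in range(censor_accounts)
    ]
    released = blocked = 0
    for day in range(0, days, release_every):
        group = rng.randrange(num_groups)  # the subset that gets this site
        released += 1
        if any(g == group and day >= active for g, active in accounts):
            blocked += 1
    return released, blocked


rng = random.Random(1)
released, blocked = simulate(num_groups=20, censor_accounts=5,
                             max_delay_days=30, days=90, rng=rng)
print("sites released:", released, "sites visible to the censor:", blocked)
```

In this model the censor can raise the blocked count by creating more accounts and waiting, which matches the point above: the scheme is not unbreakable, it just removes the instant join-observe-block loop.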
The Tor protocol, by the way, has been the subject of a great deal of sophisticated mathematical analysis, really brainy stuff that is beyond the scope of this article. But it's important to understand that that analysis focuses on the security of the Tor protocol for achieving anonymity. For anonymity, the protocol is very strong; for routing around censorship, it's fairly straightforward to defeat. That's not at all a criticism of the Tor developers; Tor was designed to achieve anonymity, and just turned out to work for beating censorship as well -- but only, of course, as long as the censors aren't making much effort to block it.
Which all leads to the obvious question: Why have the censors not bothered?
Nobody knows for sure, but I fear the answer is that the Chinese government and other censors know that the greatest weapon in their arsenal is not IP blocking, or keyword filtering, or even the threat of arrest. It's just apathy. The Chinese censors know what we anti-censorware developers in the free world keep forgetting: that most Chinese are not liberty-minded Jeffersonians chomping at the bit under the oppressive yoke of their government and waiting to be freed by circumvention software. As Michael Chase and James Mulvenon of the RAND Corporation put it in their report on Internet usage by Chinese dissidents, You've Got Dissent!: "[A]lthough some peer-to-peer applications... are designed specifically to combat censorship on the Internet and address privacy concerns, most Chinese Internet users are undoubtedly more interested in using peer-to-peer applications for entertainment purposes such as downloading MP3 music files." The censors know what Netscape knew when they fought tooth and nail against Microsoft including Internet Explorer on the desktop of every Windows machine: defaults matter. It doesn't matter that users can go to Netscape's site and download their browser, and it doesn't matter that users can access a banned site by installing a cool p2p program. Most people just don't.
When I first started working on the Circumventor, I assumed that since the Chinese Internet censorship bureau reportedly employed about 30,000 people, surely if they were already spending that much effort and money, they'd throw plenty of resources at defeating any new anti-censorship program, so the Circumventor would have to be able to withstand any such attack. But I was wrong. According to the RAND Corporation paper, the censors have been quite busy, for example, policing political forums for dissident postings that other users might casually run into. But they apparently assume -- correctly, it seems -- that content doesn't pose much of a threat if users have to go out of their way and download a program to access it. And if the user has to have a friend outside the country to help them, then forget it.
This is not to downplay the enormous good that programs like Tor, Circumventor and Psiphon can do in bringing free speech to the people in censored countries who want it. But it's easy to forget that those often do not comprise a large part of the population.
One of the biggest disappointments for me came in May 2005 when I was looking for ways to get around the word filter on MSN China's blogging service. Microsoft, apparently acting on public relations advice from Lex Luthor, had decided to filter the words "freedom", "democracy", and "Taiwan independence" from the titles of blogs on MSN China. (I know, I know, they have to comply with Chinese laws to do business there. But I don't think the Chinese have actually outlawed the word "democracy".) Eventually I did find a loophole, so I searched on MSN for some Chinese blogs published by expatriates to ask them to help test the workaround for me. With a few exceptions, most of the bloggers were rather hostile, saying that they supported their government's efforts to censor the Internet and to stamp out Falun Gong as a dangerous "cult". (These were expats living in the U.S., so presumably they were not worried about the Chinese government sending a tank across the Pacific to run them over if they criticized the ruling party. Even if they thought they had to watch what they said because they might someday return to China, or because they still had family there, surely it would have been easier just to ignore me; the hostility that I encountered sounded genuine.) The moral is, no matter how much your movement believes in its efforts to help oppressed people, you can't just assume you'll be greeted as liberators (ahem).
So now you know most of what there is to know about the state of the art in anti-censorship software. It's just that there is less to understand than the hype originally suggests -- the programs aren't really secure, but they work because the censors aren't really trying. And there aren't any cool mathematical formulas that you can impress your friends with -- for that, you'll still have to go back to Applied Cryptography. It's a lot less impressive to be the Bruce Schneier of circumvention algorithms than it is to be the real Bruce Schneier.
-
Behind the Magic of Anti-Censorship Software
Regular Slashdot contributor Bennett Haselton writes in to say "The December 1st release of Psiphon has sparked renewed interest in the various software programs that can help circumvent Internet censorship in China, Iran, and other censored countries. (Some of this interest undoubtedly being motivated by the fact that many of these programs also work for getting around blocking software at work or school.) Have you ever wanted to understand the science behind these programs, the way that mathematicians and codebreakers understand the magic behind PGP? If you loved the mental workout of reading "Applied Cryptography", have you ever wanted a tutorial to do the same for Psiphon and Tor and other anti-censorship programs?" The rest of his editorial follows.
Well, here's a primer, but you might be disappointed. Like making the Statue of Liberty disappear, it doesn't sound very cool once you know how it's done; the truth is that most anti-censorship programs, including mine, only work because the censors are not trying very hard.
(Note that I am going to be talking about ways that certain anti-censorship programs can be defeated. I don't believe that this is giving much help to censors, because these are obvious weaknesses that would occur to anyone who knows how the programs work. For reasons I'll get into at the end, I don't think these weaknesses actually make much difference.)
Basically, all anti-censorship programs fall into two categories: those that require you to have a helper outside of the censored country, and those that don't.
Take Psiphon. To use Psiphon, someone in a non-censored country has to install it on their home computer, which turns their computer into a Web server with an interface similar to Anonymouse.org, where you type in the URL of the page you want to view and it fetches it for you. The difference, of course, is that Anonymouse.org is widely known and blocked by any self-respecting Internet filtering system, while your newly created Psiphon URL pointing to your home computer is not blocked anywhere, yet. So if you set up a Psiphon URL on your computer in the U.S. and e-mail it to your friend in China, your friend can use it to surf wherever they want. (Note that this also has the desirable property that the person in China doesn't have to install any software, so they can use the URL even from a cybercafe computer with restricted user permissions.) The hurdle, of course, is that the person in China has to have a contact outside the country to help them. This is not a huge barrier for many Chinese, but it still means the program doesn't have the instant gratification property of something that you turn on and it just works.
Peacefire, by the way, had released the Circumventor program in 2003 which did essentially the same thing. (And the Circumventor was itself really just a wizard for installing a Web server with James Marshall's CGIProxy script, which deserves most of the credit, although the Circumventor did help bring it "to the masses", since most users don't have the ability to set up an SSL-enabled Web server themselves.) Psiphon made some improvements, namely:
- Ability to create password-protected accounts to restrict the URL to certain users.
- Smaller download (although it may not matter much since only broadband users would be installing it anyway).
- Ability to run on Linux. (Circumventor only works on Windows, although you can install CGIProxy on a Linux webserver if you know how.)
- A wizard to help users forward incoming connections on their router and enter exceptions in software firewalls to make the software work. (If they want to. No tweaking people's firewall settings without asking them!)
- On the other hand, Circumventor is slightly harder to block, due to strategies such as using a different SSL certificate for each install (Psiphon uses the same one each time).
And both programs fall victim to the same attacks, although as far as I know, none of these have been implemented in practice:
- Blocking sites whose SSL certificates do not match the site hostname (easier for a censoring proxy server like the ones used in the Middle East, than for an IP firewall like the Great Firewall of China).
- Blocking outgoing Web connections to residential IP address ranges like Comcast.
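To make the two attacks above concrete, here is a minimal sketch of the rules a censoring proxy might apply. Everything in it is an illustrative assumption rather than a description of any real filter: the function names are made up, the "residential" ranges are reserved documentation prefixes standing in for a real ISP's blocks, and the certificate matching is deliberately simplified.

```python
import ipaddress

# Hypothetical "residential" ranges. These are reserved documentation
# prefixes, used here as stand-ins for a real ISP's address blocks.
RESIDENTIAL_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def cert_matches_host(hostname, cert_names):
    """Return True if any name on the certificate covers the hostname.

    Simplified matching: an exact match, or a leading "*." wildcard
    that covers exactly one label.
    """
    hostname = hostname.lower()
    for name in cert_names:
        name = name.lower()
        if name == hostname:
            return True
        if name.startswith("*.") and "." in hostname:
            if hostname.split(".", 1)[1] == name[2:]:
                return True
    return False


def is_residential(ip):
    """Return True if the address falls in one of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RESIDENTIAL_RANGES)


def should_block(hostname, ip, cert_names):
    """Block on a certificate/hostname mismatch or a residential address."""
    return (not cert_matches_host(hostname, cert_names)) or is_residential(ip)


if __name__ == "__main__":
    # A home-hosted proxy presenting a certificate for some other name.
    print(should_block("home.example.net", "203.0.113.7", ["localhost"]))
    # An ordinary site with a matching certificate on a non-listed address.
    print(should_block("www.example.org", "192.0.2.10", ["*.example.org"]))
```

The first rule needs a device that sees the certificate and the requested hostname together, which is why it suits a censoring proxy better than a plain IP firewall.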
But basically, they're the same program -- so the difference in press coverage has illustrated how much context matters to reporters. Psiphon is the "politically correct" version -- they've played down the fact that it can be used to get around blocking software in schools and played up the fact that it can be used to beat the censors in China and Iran, and the press coverage has focused exclusively on that human rights aspect. The Circumventor was also written to help foreign victims of censorship, and articles have been written about its uses for that purpose, but I've also been unapologetically promoting its use to get around blocking software at home and in school, as part of advocating for greater civil rights for people under 18. (Also because the more installations there are in the U.S., the more it helps users abroad.) As a result, some of the TV news pieces about it have used such ominous music and lighting that they practically looked like recycled footage from "To Catch a Predator". Of course, Psiphon can be used for exactly the same thing. (I also emailed some of the reporters who recently wrote about Psiphon, to tell them about Circumventor; so far, I haven't heard back from any of them, but I doubt they're being politically correct this time; I think they're just not thrilled that C-Net scooped them by three years and seven months.)
So, Psiphon and Circumventor fall in the first category -- programs that only work if you've got a contact outside the censored country to help you. In the second category is Tor, which was originally written to provide mathematically secure anonymity, but had the nice property that it could be used to get around the Great Firewall of China as well. With your browser in China using Tor as a proxy, packets are routed to other Tor nodes outside the country, which connect you with any blocked Web site that you want to see. Best of all, you just install it on a machine in China, and presto, it works, no nagging your expat cousin in the U.S. to install something on their computer to help you. Dynamic Internet Technologies, run by Chinese dissident Bill Xia in North Carolina, runs another service that works "out of the box" -- you send an instant-message to one of the DIT screen names, and it replies with a list of currently running Web proxies. (Bill has asked me not to publicize the actual screen names that perform this service, because it's intended only for Chinese users. I think that's a case of "security through obscurity", but I respect his wishes.)
Unfortunately, all such "instant gratification" solutions have the same basic weakness, which by a simple argument can be extended even to hypothetical future programs in the same category. In the case of a program like Tor, the censor only has to install the software, look at what IP addresses the software connects to when it bootstraps itself, and add those IP addresses to the blacklist. Even if the software chooses at random from multiple IP addresses to bootstrap to, the censor can still obtain all of them by repeatedly re-installing the software (possibly wiping the machine each time so the software can't tell that it's been installed before). No matter how you slice it, if Alice the legitimate user and Bob the censor download the program on the same day, Bob can make the program not work for Alice if he updates the blacklist quickly enough. He doesn't even have to reverse-engineer the software; he just has to use a network sniffer to see where it connects. (For DIT's proxy-by-instant-message system, the censor can instant-message the screen name repeatedly, from different accounts, until they've collected and blocked all the available proxies; this would be analogous to re-installing Tor repeatedly and seeing what IPs it connects to.)
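The reinstall-and-sniff loop can be sketched as a small simulation. This is a toy model under stated assumptions, not a description of how Tor or any real client bootstraps: the pool of 50 addresses, the three contacts per install, and the function names are all hypothetical, and the addresses come from a reserved documentation range.

```python
import random


def fresh_install(bootstrap_pool, rng, picks=3):
    """Simulate one clean install: the client contacts a few randomly
    chosen bootstrap addresses, which the censor's sniffer records."""
    return set(rng.sample(sorted(bootstrap_pool), picks))


def enumerate_bootstrap(bootstrap_pool, rng, max_installs=10_000):
    """Censor's loop: reinstall, sniff, add to the blacklist, repeat.

    The simulation stops once every address has been seen. A real censor
    does not know the pool size and would instead stop when repeated
    installs stop revealing anything new.
    """
    blacklist = set()
    installs = 0
    while blacklist != bootstrap_pool and installs < max_installs:
        blacklist |= fresh_install(bootstrap_pool, rng)
        installs += 1
    return blacklist, installs


if __name__ == "__main__":
    rng = random.Random(0)
    pool = {f"192.0.2.{i}" for i in range(1, 51)}  # hypothetical addresses
    blacklist, installs = enumerate_bootstrap(pool, rng)
    print(f"blocked {len(blacklist)} of {len(pool)} addresses "
          f"after {installs} installs")
```

Random selection only raises the number of installs the censor needs; it does not change the outcome, since every install leaks a few more addresses and the pool is finite.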
Peacefire has produced another approach, which is a simple, obvious idea, and it was quite by accident that we found out it slips through the cracks of the seemingly "unsolvable" problem with instant gratification outlined above. Like the other solutions, it works only as long as the censors are fairly lazy, but they are, and it does. About 30,000 people have signed up through a form on our site to be notified each time we create a new Circumventor site and mail it out, every 3 or 4 days. Agents of the blocking companies have joined the list too, of course, but we mail different sites to different subsets of the list. Now, an attack analogous to the attacks listed in the previous paragraph would be for the censors to join under many different accounts, and then block any site that gets mailed to any of those accounts. But the catch is that when an address joins the list, a new site doesn't get mailed to that address until some random time in the future. So the censor has to check all of the fake Hotmail accounts that they've created, over and over, if they want to block all of the new sites as soon as they're released. Hardly impossible, but the censor can no longer use the instantaneous approach of: (1) enter the system / join the list / install the software; (2) see where it connects to and block those points of access; (3) repeat. (If we instantly e-mailed a randomly selected site to each new signup, then this attack would work.) By going from instant gratification to almost-instant-gratification, you change one of the conditions for the theorem stated in the previous paragraph, so that it no longer holds true. Still, like Tor and the DIT system, it could be blocked with a moderate amount of effort.
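A rough way to see why the delay matters is to model only the timing of the mail-outs. The sketch below is a simplification and makes assumptions the text does not state: each fake account is sent one site picked at random from a fixed list, the delay is uniform between 1 and 14 days, and all names are hypothetical. It is not how the actual mailing list works.

```python
import random


def sites_seen_by_censor(num_accounts, live_sites, rng, instant,
                         max_delay_days=14, poll_day=0):
    """Count distinct sites the censor's fake accounts have received
    by `poll_day`, where day 0 is the day all accounts signed up.

    instant=True  -> each account is mailed a site at signup.
    instant=False -> each account is mailed a site after a random
                     delay of 1..max_delay_days days.
    """
    seen = set()
    for _ in range(num_accounts):
        site = rng.choice(live_sites)
        delivery_day = 0 if instant else rng.randint(1, max_delay_days)
        if delivery_day <= poll_day:
            seen.add(site)
    return len(seen)


if __name__ == "__main__":
    sites = [f"site-{i}.example" for i in range(10)]  # hypothetical sites
    for label, instant, day in [("instant, day 0", True, 0),
                                ("delayed, day 0", False, 0),
                                ("delayed, day 14", False, 14)]:
        count = sites_seen_by_censor(200, sites, random.Random(1),
                                     instant, poll_day=day)
        print(f"{label}: censor has seen {count} of {len(sites)} sites")
```

Under the delayed policy a burst of signups teaches the censor nothing on the day it happens; the same accounts have to be checked again and again over the following days, which is the extra effort described above.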
The Tor protocol, by the way, has been the subject of a great deal of sophisticated mathematical analysis, really brainy stuff that is beyond the scope of this article. But it's important to understand that that analysis focuses on the security of the Tor protocol for achieving anonymity. For anonymity, the protocol is very strong; for routing around censorship, it's fairly straightforward to defeat. That's not at all a criticism of the Tor developers; Tor was designed to achieve anonymity, and just turned out to work for beating censorship as well -- but only, of course, as long as the censors aren't making much effort to block it.
Which all leads to the obvious question: Why have the censors not bothered?
Nobody knows for sure, but I fear the answer is that the Chinese government and other censors know that the greatest weapon in their arsenal is not IP blocking, or keyword filtering, or even the threat of arrest. It's just apathy. The Chinese censors know what we anti-censorware developers in the free world keep forgetting: that most Chinese are not liberty-minded Jeffersonians chomping at the bit under the oppressive yoke of their government and waiting to be freed by circumvention software. As Michael Chase and James Mulvenon of the RAND Corporation put it in their report on Internet usage by Chinese dissidents, You've Got Dissent!: "[A]lthough some peer-to-peer applications... are designed specifically to combat censorship on the Internet and address privacy concerns, most Chinese Internet users are undoubtedly more interested in using peer-to-peer applications for entertainment purposes such as downloading MP3 music files." The censors know what Netscape knew when they fought tooth and nail against Microsoft including Internet Explorer on the desktop of every Windows machine: defaults matter. It doesn't matter that users can go to Netscape's site and download their browser, and it doesn't matter that users can access a banned site by installing a cool p2p program. Most people just don't.
When I first started working on the Circumventor, I assumed that since the Chinese Internet censorship bureau reportedly employed about 30,000 people, surely if they were already spending that much effort and money, they'd throw plenty of resources at defeating any new anti-censorship program, so the Circumventor would have to be able to withstand any such attack. But I was wrong. According to the RAND Corporation paper, the censors have been quite busy, for example, policing political forums for dissident postings that other users might casually run into. But they apparently assume -- correctly, it seems -- that content doesn't pose much of a threat if users have to go out of their way and download a program to access it. And if the user has to have a friend outside the country to help them, then forget it.
This is not to downplay the enormous good that programs like Tor, Circumventor and Psiphon can do in bringing free speech to the people in censored countries who want it. But it's easy to forget that those often do not comprise a large part of the population.
One of the biggest disappointments for me came in May 2005 when I was looking for ways to get around the word filter on MSN China's blogging service. Microsoft, apparently acting on public relations advice from Lex Luthor, had decided to filter the words "freedom", "democracy", and "Taiwan independence" from the titles of blogs on MSN China. (I know, I know, they have to comply with Chinese laws to do business there. But I don't think the Chinese have actually outlawed the word "democracy".) Eventually I did find a loophole, so I searched on MSN for some Chinese blogs published by expatriates to ask them to help test the workaround for me. With a few exceptions, most of the bloggers were rather hostile, saying that they supported their government's efforts to censor the Internet and to stamp out Falun Gong as a dangerous "cult". (These were expats living in the U.S., so presumably they were not worried about the Chinese government sending a tank across the Pacific to run them over if they criticized the ruling party. Even if they thought they had to watch what they said because they might someday return to China, or because they still had family there, surely it would have been easier just to ignore me; the hostility that I encountered sounded genuine.) The moral is, no matter how much your movement believes in its efforts to help oppressed people, you can't just assume you'll be greeted as liberators (ahem).
So now you know most of what there is to know about the state of the art in anti-censorship software. It's just that there is less to understand than the hype originally suggests -- the programs aren't really secure, but they work because the censors aren't really trying. And there aren't any cool mathematical formulas that you can impress your friends with -- for that, you'll still have to go back to Applied Cryptography. It's a lot less impressive to be the Bruce Schneier of circumvention algorithms than it is to be the real Bruce Schneier.
-
Behind the Magic of Anti-Censorship Software
Regular Slashdot contributor Bennett Haselton writes in to say "The December 1st release of Psiphon has sparked renewed interest in the various software programs that can help circumvent Internet censorship in China, Iran, and other censored countries. (Some of this interest undoubtedly being motivated by the fact that many of these programs also work for getting around blocking software at work or school.) Have you ever wanted to understand the science behind these programs, the way that mathematicians and codebreakers understand the magic behind PGP? If you loved the mental workout of reading "Applied Cryptography", have you ever wanted a tutorial to do the same for Psiphon and Tor and other anti-censorship programs?" The rest of his editorial follows.
Well, here's a primer, but you might be disappointed. Like making the Statue of Liberty disappear, it doesn't sound very cool once you know how it's done; the truth is that most anti-censorship programs, including mine, only work because the censors are not trying very hard.
(Note that I am going to be talking about ways that certain anti-censorship programs can be defeated. I don't believe that this is giving much help to censors, because these are obvious weaknesses that would occur to anyone who knows how the programs work. For reasons I'll get into at the end, I don't think these weaknesses actually make much difference.)
Basically, all anti-censorship programs fall into two categories: those that require you to have a helper outside of the censored country, and those that don't.
Take Psiphon. To use Psiphon, someone in a non-censored country has to install it on their home computer, which turns their computer into a Web server with an interface similar to Anonymouse.org, where you type in the URL of the page you want to view and it fetches it for you. The difference, of course, is that Anonymouse.org is widely known and blocked by any self-respecting Internet filtering system, while your newly created Psiphon URL pointing to your home computer is not blocked anywhere, yet. So if you set up a Psiphon URL on your computer in the U.S. and e-mail it to your friend in China, your friend can use it to surf wherever they want. (Note that this also has the desirable property that the person in China doesn't have to install any software, so they can use the URL even from a cybercafe computer with restricted user permissions.) The hurdle, of course, is that the person in China has to have a contact outside the country to help them. This is not a huge barrier for many Chinese, but it still means the program doesn't have the instant gratification property of something that you turn on and it just works.
Peacefire, by the way, had released the Circumventor program in 2003 which did essentially the same thing. (And the Circumventor was itself really just a wizard for installing a Web server with James Marshall's CGIProxy script, which deserves most of the credit, although the Circumventor did help bring it "to the masses", since most users don't have the ability to set up an SSL-enabled Web server themselves.) Psiphon made some improvements, namely:
- Ability to create password-protected accounts to restrict the URL to certain users.
- Smaller download (although it may not matter much since only broadband users would be installing it anyway).
- Ability to run on Linux. (Circumventor only works on Windows, although you can install CGIProxy on a Linux webserver if you know how.)
- A wizard to help users forward incoming connections on their router and enter exceptions in software firewalls to make the software work. (If they want to. No tweaking people's firewall settings without asking them!)
- Circumventor, for its part, is slightly harder to block, due to strategies such as using a different SSL certificate for each install (Psiphon uses the same one each time).
And both programs fall victim to the same attacks, although as far as I know, none of these have been implemented in practice:
- Blocking sites whose SSL certificates do not match the site hostname (easier for a censoring proxy server like the ones used in the Middle East, than for an IP firewall like the Great Firewall of China).
- Blocking outgoing Web connections to residential IP address ranges like Comcast.
But basically, they're the same program -- so the difference in press coverage has illustrated how much context matters to reporters. Psiphon is the "politically correct" version -- they've played down the fact that it can be used to get around blocking software in schools and played up the fact that it can be used to beat the censors in China and Iran, and the press coverage has focused exclusively on that human rights aspect. The Circumventor was also written to help foreign victims of censorship, and articles have been written about its uses for that purpose, but I've also been unapologetically promoting its use to get around blocking software at home and in school, as part of advocating for greater civil rights for people under 18. (Also because the more installations there are in the U.S., the more it helps users abroad.) As a result, some of the TV news pieces about it have used such ominous music and lighting that they practically looked like recycled footage from "To Catch a Predator". Of course, Psiphon can be used for exactly the same thing. (I also emailed some of the reporters who recently wrote about Psiphon, to tell them about Circumventor; so far, I haven't heard back from any of them, but I doubt they're being politically correct this time; I think they're just not thrilled that C-Net scooped them by three years and seven months.)
So, Psiphon and Circumventor fall in the first category -- programs that only work if you've got a contact outside the censored country to help you. In the second category is Tor, which was originally written to provide mathematically secure anonymity, but had the nice property that it could be used to get around the Great Firewall of China as well. With your browser in China using Tor as a proxy, packets are routed to other Tor nodes outside the country, which connect you with any blocked Web site that you want to see. Best of all, you just install it on a machine in China, and presto, it works, no nagging your expat cousin in the U.S. to install something on their computer to help you. Dynamic Internet Technologies, run by Chinese dissident Bill Xia in North Carolina, runs another service that works "out of the box" -- you send an instant-message to one of the DIT screen names, and it replies with a list of currently running Web proxies. (Bill has asked me not to publicize the actual screen names that perform this service, because it's intended only for Chinese users. I think that's a case of "security through obscurity", but I respect his wishes.)
Unfortunately, all such "instant gratification" solutions have the same basic weakness, which by a simple argument can be extended even to hypothetical future programs in the same category. In the case of a program like Tor, the censor only has to install the software, look at what IP addresses the software connects to when it bootstraps itself, and add those IP addresses to the blacklist. Even if the software chooses at random from multiple IP addresses to bootstrap to, the censor can still obtain all of them by repeatedly re-installing the software (possibly wiping the machine each time so the software can't tell that it's been installed before). No matter how you slice it, if Alice the legitimate user and Bob the censor download the program on the same day, Bob can make the program not work for Alice if he updates the blacklist quickly enough. He doesn't even have to reverse-engineer the software, he just has to use a network sniffer to see where it connects to. (For DIT's proxy-by-instant-message system, the censor can instant-message the screen name repeatedly, from different accounts, until they've collected and blocked all the available proxies; this would be analogous to re-installing Tor repeatedly and seeing what IPs it connects to.)
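The re-install attack described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the pool of 50 bootstrap addresses, the choice of 3 contacts per install, and the `patience` stopping rule are all hypothetical, and none of the names correspond to anything in Tor itself. The censor never sees the full list; he just keeps re-installing until many installs in a row reveal nothing new.

```python
import random

def fresh_install(bootstrap_pool, k, rng):
    """Hypothetical client: on a clean install, contact k randomly chosen
    bootstrap addresses. Returns what a network sniffer would observe."""
    return set(rng.sample(bootstrap_pool, k))

def enumerate_bootstrap(bootstrap_pool, k, rng, patience=200):
    """Censor's loop: wipe the machine, re-install, record every address
    contacted. Stop after `patience` consecutive installs show nothing new."""
    seen = set()
    quiet = 0      # consecutive installs that revealed no new address
    installs = 0
    while quiet < patience:
        contacted = fresh_install(bootstrap_pool, k, rng)
        installs += 1
        if contacted <= seen:
            quiet += 1
        else:
            seen |= contacted
            quiet = 0
    return seen, installs

rng = random.Random(0)
# Assumed pool; 192.0.2.0/24 is a documentation-only address range.
pool = [f"192.0.2.{i}" for i in range(1, 51)]
blacklist, installs = enumerate_bootstrap(pool, k=3, rng=rng)
print(len(blacklist), "of", len(pool), "addresses collected in", installs, "installs")
```

The point of the sketch is that randomizing the bootstrap choice only raises the number of installs the censor needs; it does not stop him from collecting the whole pool.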
Peacefire has produced another approach which is a simple, obvious idea, and it was quite by accident that we found out it slips through the cracks of the seemingly "unsolvable" problem with instant-gratification programs outlined above. Like the other solutions, it works only as long as the censors are fairly lazy, but they are, and it does. About 30,000 people have signed up through a form on our site to be notified each time we create a new Circumventor site and mail it out, every 3 or 4 days. Agents of the blocking companies have joined the list too, of course, but we mail different sites to different subsets of the list. Now, an attack analogous to the attacks listed in the previous paragraph would be for the censors to join under many different accounts, and then block any site that gets mailed to any of those accounts. But the catch is that when an address joins the list, a new site doesn't get mailed to that address until some random time in the future. So the censor has to check all of the fake Hotmail accounts that they've created, over and over, if they want to block all of the new sites as soon as they're released. Hardly impossible, but the censor can no longer use the instantaneous approach of: (1) enter the system / join the list / install the software; (2) see where it connects to and block those points of access; (3) repeat. (If we instantly e-mailed a randomly selected site to each new signup, then this attack would work.) By going from instant gratification to almost-instant-gratification, you change one of the conditions for the theorem stated in the previous paragraph, so that it no longer holds true. Still, like Tor and the DIT system, it could be blocked with a moderate amount of effort.
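A minimal sketch of the delayed-mailing idea, assuming a simplified model: the class name `DelayedList`, the 1-to-30-day uniform delay, and the example addresses and site names are all hypothetical and are not taken from Peacefire's actual system. Each address that joins is assigned a random future day before which it receives nothing, and each release hands different sites to different recipients.

```python
import random

class DelayedList:
    """Hypothetical mailing list: a new address receives no site until a
    randomly chosen day in the future (the delay range is an assumption)."""

    def __init__(self, rng, min_delay=1.0, max_delay=30.0):
        self.rng = rng
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.eligible_at = {}  # address -> first day it may receive a site

    def join(self, address, today):
        delay = self.rng.uniform(self.min_delay, self.max_delay)
        self.eligible_at[address] = today + delay

    def release(self, sites, today):
        """Mail one of the new sites to each eligible address, so that
        different subsets of the list receive different sites."""
        return {
            address: self.rng.choice(sites)
            for address, day in self.eligible_at.items()
            if day <= today
        }

rng = random.Random(1)
mailing = DelayedList(rng)
# The censor signs up 100 throwaway accounts on day 0.
for i in range(100):
    mailing.join(f"censor{i}@example.com", today=0.0)

sites = ["site-a.example", "site-b.example", "site-c.example"]
same_day = mailing.release(sites, today=0.0)   # nothing arrives at signup
later = mailing.release(sites, today=31.0)     # every account is eligible by now
print(len(same_day), len(later))
```

In this model the join-and-block loop yields nothing on the day of signup; the censor has to keep polling every fake account for weeks, which is the "almost-instant-gratification" trade described above.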
The Tor protocol, by the way, has been the subject of a great deal of sophisticated mathematical analysis, really brainy stuff that is beyond the scope of this article. But it's important to understand that that analysis focuses on the security of the Tor protocol for achieving anonymity. For anonymity, the protocol is very strong; for routing around censorship, it's fairly straightforward to defeat. That's not at all a criticism of the Tor developers; Tor was designed to achieve anonymity, and just turned out to work for beating censorship as well -- but only, of course, as long as the censors aren't making much effort to block it.
Which all leads to the obvious question: Why have the censors not bothered?
Nobody knows for sure, but I fear the answer is that the Chinese government and other censors know that the greatest weapon in their arsenal is not IP blocking, or keyword filtering, or even the threat of arrest. It's just apathy. The Chinese censors know what we anti-censorware developers in the free world keep forgetting: that most Chinese are not liberty-minded Jeffersonians chomping at the bit under the oppressive yoke of their government and waiting to be freed by circumvention software. As Michael Chase and James Mulvenon of the RAND Corporation put it in their report on Internet usage by Chinese dissidents, You've Got Dissent!: "[A]lthough some peer-to-peer applications... are designed specifically to combat censorship on the Internet and address privacy concerns, most Chinese Internet users are undoubtedly more interested in using peer-to-peer applications for entertainment purposes such as downloading MP3 music files." The censors know what Netscape knew when they fought tooth and nail against Microsoft including Internet Explorer on the desktop of every Windows machine: defaults matter. It doesn't matter that users can go to Netscape's site and download their browser, and it doesn't matter that users can access a banned site by installing a cool p2p program. Most people just don't.
When I first started working on the Circumventor, I assumed that since the Chinese Internet censorship bureau reportedly employed about 30,000 people, surely if they were already spending that much effort and money, they'd throw plenty of resources at defeating any new anti-censorship program, so the Circumventor would have to be able to withstand any such attack. But I was wrong. According to the RAND Corporation paper, the censors have been quite busy, for example, policing political forums for dissident postings that other users might casually run into. But they apparently assume -- correctly, it seems -- that content doesn't pose much of a threat if users have to go out of their way and download a program to access it. And if the user has to have a friend outside the country to help them, then forget it.
This is not to downplay the enormous good that programs like Tor, Circumventor and Psiphon can do in bringing free speech to the people in censored countries who want it. But it's easy to forget that those often do not comprise a large part of the population.
One of the biggest disappointments for me came in May 2005 when I was looking for ways to get around the word filter on MSN China's blogging service. Microsoft, apparently acting on public relations advice from Lex Luthor, had decided to filter the words "freedom", "democracy", and "Taiwan independence" from the titles of blogs on MSN China. (I know, I know, they have to comply with Chinese laws to do business there. But I don't think the Chinese have actually outlawed the word "democracy".) Eventually I did find a loophole, so I searched on MSN for some Chinese blogs published by expatriates to ask them to help test the workaround for me. With a few exceptions, most of the bloggers were rather hostile, saying that they supported their government's efforts to censor the Internet and to stamp out Falun Gong as a dangerous "cult". (These were expats living in the U.S., so presumably they were not worried about the Chinese government sending a tank across the Pacific to run them over if they criticized the ruling party. Even if they thought they had to watch what they said because they might someday return to China, or because they still had family there, surely it would have been easier just to ignore me; the hostility that I encountered sounded genuine.) The moral is, no matter how much your movement believes in its efforts to help oppressed people, you can't just assume you'll be greeted as liberators (ahem).
So now you know most of what there is to know about the state of the art in anti-censorship software. It's just that there is less to understand than the hype originally suggests -- the programs aren't really secure, but they work because the censors aren't really trying. And there aren't any cool mathematical formulas that you can impress your friends with -- for that, you'll still have to go back to Applied Cryptography. It's a lot less impressive to be the Bruce Schneier of circumvention algorithms than it is to be the real Bruce Schneier.