How About an Intelligent Open Source Filter?
If we must censor content on the Internet, I would feel better about it (but not much) knowing that the censorship was done by the people rather than some bureaucrat in Congress.
Comment by michael : Many anti-censorship folks have been pushing this line for a long time: that the blacklists used in public institutions should, at a minimum, be open for public inspection. This would, no doubt, help cure some of the more egregious errors. But the above poster is making an error in his reasoning. Computers do the searching because there is no other way to do it - you simply cannot categorize hundreds of millions of pages by hand, period, end of sentence. And an algorithmic approach can never fully characterize the range of human expression present on the Web - even assuming, for the moment, that you could get people to agree on what should or should not be censored, there's no way to make rules which will pick out those pages with 100% accuracy, or even anything close to that. Doing so would require the development of true artificial intelligence, which isn't even on the horizon. Calling something "open source" or not doesn't make it magically able to achieve a breakthrough in artificial intelligence. When you add in the fact that with three people in a room you have four different opinions on what should and should not be censored, it should become clear that throwing an open source label at something is not going to result in an easy solution.
I don't really care about seeing porn from time to time and I don't have any kids. What I would like though is a good distributed web moderation system.
This is already done to some degree with some search engines, especially specialized ones for books or movies. But I would love to see something built into my browser that would allow me (securely and privately of course) to set my preferences
(intelligent > 2 | funny > 4 ) & (commercial 1)
so that links to other stuff are colored differently. Of course obtaining moderator status would be hard, so it would probably have to be a Firefly kind of thing where your views are matched with those of others who share them (securely and privately of course). Consider it the poor man's intelligent agent.
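A minimal sketch of how a browser might test a link's moderation scores against a preference like the one above. The score names come from the expression itself; the `matches` function, the dict layout, and the `< 1` comparison for `commercial` (the operator is missing from the expression as posted) are assumptions made purely for illustration.

```python
def matches(scores):
    """Return True if a link's moderation scores satisfy the preference.

    Mirrors (intelligent > 2 | funny > 4) & (commercial < 1). The "< 1"
    is an assumed operator, since the posted expression omits it.
    Missing scores are treated as 0 (also an assumption).
    """
    intelligent = scores.get("intelligent", 0)
    funny = scores.get("funny", 0)
    commercial = scores.get("commercial", 0)
    return (intelligent > 2 or funny > 4) and commercial < 1


# Hypothetical links with hypothetical moderation scores.
links = {
    "essay": {"intelligent": 4, "funny": 1, "commercial": 0},
    "joke page": {"intelligent": 0, "funny": 5, "commercial": 0},
    "sales pitch": {"intelligent": 3, "funny": 0, "commercial": 2},
}

# Links that pass would be the ones colored differently in the browser.
highlighted = [name for name, s in links.items() if matches(s)]
print(highlighted)
```

Where the scores come from (trusted moderators or matched peers) is the hard part; the expression itself is cheap to evaluate.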
--
"L'IT c'est moi!"
"You can't categorize millions of pages by hand, period."
Huh? What happened to the Open Ideal of "many hands, many eyes, many minds"? How many millions of lines of code have been "categorized" by how many programmers? How many hours of thought are behind each one?
How many CD ID numbers and associated track lists have been categorized by users of the free (and non-free) CD databases?
What experiences and thought processes make you so sure that it can't be done, period?
Good judgement comes from experience, and experience comes from bad judgement.
- W. Wriston, former Citibank CEO
When Junkbuster achieves popularity, it will be subverted.
Besides, Junkbuster works because many of the ads include in their URL obvious clues like *ad*, *banner*, *promo* and are sent from centralized sites like DoubleClick and other media brokers.
Porn and racist URLs can be much more diverse, and they usually link to pages on the same site. It would be difficult to list every site that runs censurable pages.
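To make the point concrete, here is a rough sketch of URL-clue blocking of the kind described above. It is not Junkbuster's actual blocklist format or code; the `is_blocked` helper, the pattern list, and the host entry are assumptions for illustration only.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Clues named in the comment above, written as shell-style wildcards.
BLOCK_PATTERNS = ["*ad*", "*banner*", "*promo*"]
# Hypothetical entry for a centralized ad broker.
BLOCK_HOSTS = {"doubleclick.net"}


def is_blocked(url):
    """Block a URL if its host is a listed broker or its path matches a clue."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if any(host == h or host.endswith("." + h) for h in BLOCK_HOSTS):
        return True
    path = parts.path.lower()
    return any(fnmatchcase(path, pattern) for pattern in BLOCK_PATTERNS)


for url in [
    "http://ad.doubleclick.net/x.gif",
    "http://example.com/banner/top.gif",
    "http://example.com/news/story.html",
    "http://example.com/download/readme.txt",
]:
    print(url, is_blocked(url))
```

Note that the last URL is blocked too, because "download" and "readme" both contain "ad". Crude clues work tolerably for centralized ad servers and get worse from there.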
__
__
Men with no respect for life must never be allowed to control the ultimate instruments of death.
GW Bu
That seems similar to PICS (not that I know so much about PICS).
But you would be blocking entire sites that use dynamic links, like Slashdot.
Slashdot can contain censurable comments anywhere, and those comments can have lots of URLs, because parameters are embedded in the URL.
As Tim Berners-Lee says somewhere on the W3C site, URLs should not be stuffed with representation details; those should be negotiated.
__
Why do we call these programs "filters"? Because they "clean" the internet. However, that makes one grand assumption - that the internet contains an abundance of dirtiness that for some reason should not be viewed by human eyes.
The fallacy in this reasoning is the idea that information can be dirty or clean, impure or proper, amoral or moral. I submit that all information is neither - that it only becomes so in the mind of the viewer.
One thing I have noticed in my 27 years of life (which I realize is by no means a long life view, but it is what I have to work with) is that those people who are the most vocal about being and living "clean" tend to be the same ones who are secretly "dirty" behind the back of society. Those people who oppose or don't care about the issue are labeled as "dirty" by these same people, because these "unclean" people hold up a mirror to them of the way they really are. They feel that if they could rid the world of "dirtiness", they themselves might become clean, and could thus dispense with their secret.
This idea is a perversion of logic. Those people that do this generally have failed to be honest with themselves and others. They usually don't realize that by being truthful (though it may be painful) they can rid themselves of the issue. The rest of the "dirty" world has already realized this.
We cannot pretend to protect children and adults from things which, even if severely impeded by all technical means (short of brain modification - which I am sure is coming), would still arise in their thoughts anyway (show me a child above the age of six who has never thought about sex - we are kidding ourselves if we believe that children don't). I am amused by the number of people who rail against the whole issue of porn but never consider that porn will always be there, because it is a matter of supply and demand. Do these people really think they can stop mankind from viewing sex, when the thought of sex and its pleasure is hardwired into our minds to make us procreate? If sex wasn't important, why would it be pleasurable?
I don't think we need filters. I think we need more rationality and intelligence.
Reason is the Path to God - Anon
But in the minds of those who want to impose filters onto the rest of the population, filters are a way to delude themselves into thinking that they can become "clean" by "removing" the "mirror" that stares back at them, reminding them of their own secrets.
I sympathize with your issues with "bait-and-switch" sites - I have experienced similar sites, but I wouldn't want to impose upon myself a safeguard against them, as they only "surprise" me rarely (it would be akin to keeping a small boat in the middle of a desert, just in case it floods). You mentioned you got surprised by a troll. Without knowing the link to the site, did you by any means check to see where the link was pointing before clicking on it? Also, did you suspect the individual was a troll prior to clicking on the link (sometimes I click on trolls' links, just to see where they take me - but I am never surprised by where they go)?
The only time I could see that you would want a filter for yourself would be in a work environment - one which is so draconian they monitor every move you make (and log all internet traffic). If that was a problem, I would wonder why you would continue to work there. No job is worth that kind of paranoia...
Reason is the Path to God - Anon
When a library buys a filter from a company which keeps its filter list secret, this principle of public accountability is violated. This looks to me like it could be grounds for a citizen lawsuit against a library, demanding that the filter list be published or the filtering be removed. This could serve both ways; people who don't want legitimate information restricted could hammer on the filter companies for their abuses, and people who do want porn, violence, hate speech or whatnot blocked could make certain that there aren't any critical omissions in the lists either. But I think the most likely outcome is that the filter companies would stop selling to libraries, along with a small flurry of lawsuits by people and organizations like Peacefire against companies like Mattel, for libelling them in their filter classifications (Peacefire, pornographic and violent? HAH!). To keep their filter lists "secret" (as if they'll be safe from the amateur cryptographers and hackers!) they'll have to stop selling to public organizations. Voila, the problem is stamped out at the source.
--
Time is Nature's way of keeping everything from happening at once... the bitch.
There is GroupLens to apply something like this to USENET. Check out their work.
Unfortunately, the only work I know of on this is being done by a few professors that I had in college. You can look at their web pages at: Joseph Konstan, John Riedl. The latter site has a lot more information. (ie a link to something)
It is Usenet only, but I think this is a way to start. Then we just need a way to rate pages that everyone works on and can at least partially agree to. If 100 people call a web site porn and 10 call it interesting, I'd not want my children to see it. If 5 call it porn and 100 call it interesting, it might be interesting, but not viewable (by my children, whom I want to protect more than most /. readers do) without a parent to decide whether the child needs to know about breast cancer in that level of detail (to give an example of where useful information and porn can cross).
Then there is violence. I wouldn't want to view any "violent" web site myself, if the site was movies and pictures of one person murdering another. However, if the website was hunting videos, I personally consider that normal content and would let anyone of any age see it. A collaborative filter allows me (with time) to build up a personal database cross-referenced with others. Then it can say "100 other parents who tend to think like you have called this [murder] site bad" vs "Many people call this [hunting] site violent, but those who tend to agree with you recommend it." If the entire world rates every site they visit out of habit, this could be useful.
Note that above I intentionally stuck to single issues. The hunting site with naked hunters is unacceptable (to me), even though the hunting content may be good. Filtering software must be complex enough to handle all these issues, and yet be simple enough that people use it.
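A toy sketch of the "people who tend to think like you" idea, for a single issue. This is not GroupLens's actual algorithm, just a simple agreement-weighted vote; the function names, the +1/-1 rating scale, and the sample data are all assumptions.

```python
def similarity(mine, theirs):
    """Average agreement on sites both users rated: +1 same, -1 opposite."""
    shared = set(mine) & set(theirs)
    if not shared:
        return 0.0
    return sum(mine[s] * theirs[s] for s in shared) / len(shared)


def predict(me, others, site):
    """Agreement-weighted vote on `site`; None if no comparable rater exists."""
    num = den = 0.0
    for ratings in others:
        if site not in ratings:
            continue
        weight = similarity(me, ratings)
        num += weight * ratings[site]
        den += abs(weight)
    return num / den if den else None


# Ratings: +1 = fine to view, -1 = objectionable (hypothetical sites).
me = {"hunting-tips": 1, "gore-clips": -1}
others = [
    {"hunting-tips": 1, "gore-clips": -1, "deer-season": 1},    # thinks like me
    {"hunting-tips": -1, "gore-clips": -1, "deer-season": -1},  # calls hunting violent
    {"hunting-tips": -1, "gore-clips": 1, "deer-season": -1},   # opposite tastes
]
print(predict(me, others, "deer-season"))
```

The rater who agrees with me only half the time ends up with zero weight, and the one who always disagrees counts as a vote in the other direction. Handling several issues at once (the naked hunters case) would need one such score per category.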
The idea here is that you can control whose database you search, and exactly how you search. eg: If you want to eliminate all sites that mention Libertarianism and have a 10 in their IP address, you should be able to do so.
This would let the user pick and choose whose database best met their screening requirements, AND control what sort of screening was being done.
My suggestion for the database format would be a simple one:
Main Record:
The owners of the database would be responsible for how the values are given to the site. Some may use automated searches, others may elect to screen pages themselves, yet others may elect to ask contributors to visit the sites and rate them.
The idea of having so many different scores is so that you can include sites that would otherwise be excluded. eg: Medical sites dealing with AIDS issues are going to mention sex. On the other hand, the information is also clearly both medical and educational and so would have a rating in both of those categories, too.
So, to search for AIDS stuff, you'd probably want a filter that you could instruct to exclude any sites with a sex score > (medical + educational).
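As a rough sketch of that rule: the record below carries only the three categories mentioned in the text (the full field list isn't given here), and the `SiteRecord` class, `allowed` function, and sample sites are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteRecord:
    """One rated site; only the categories named in the text are modeled."""
    url: str
    sex: int = 0
    medical: int = 0
    educational: int = 0


def allowed(rec):
    """Exclude any site whose sex score exceeds medical + educational."""
    return not (rec.sex > rec.medical + rec.educational)


# Hypothetical ratings from whichever database the user has chosen.
sites = [
    SiteRecord("http://aids-clinic.example/", sex=4, medical=8, educational=6),
    SiteRecord("http://bait.example/", sex=9, medical=1, educational=0),
]
results = [rec.url for rec in sites if allowed(rec)]
print(results)
```

The rule lives on the user's side, so two people querying the same database can get different results from it.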
This doesn't deal with bogus positives, produced by people cloning the title page, and then using a redirect to send you to the "real" site. However, you can filter out those, by simply adding the following set of records:
Index record:
Fields 1 and 2 form the combined key for the record.
The database then computes the values, not just of that page, but of all pages to a user-specified depth (not exceeding some sensible value, or you'd end up with a DoS attack). The values used would then be some weighted average of the values obtained. This would remove the risk of fraudulent title pages, or other forms of deliberate deception.
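One way the depth-limited averaging might look, as a sketch only. The `links` map stands in for the index records (page to linked or redirected pages), and the halving of weight per level, the `MAX_DEPTH` cap, and treating unrated pages as 0 are all assumptions, since the text only says "some weighted average".

```python
from collections import deque

MAX_DEPTH = 3  # hypothetical server-side cap on the user-specified depth


def aggregate_score(start, links, scores, depth, decay=0.5):
    """Weighted average score of `start` and pages reachable within `depth`.

    Each level away from the start page counts `decay` times as much
    as the level before it. Unrated pages count as 0.
    """
    depth = min(depth, MAX_DEPTH)
    seen = {start}
    queue = deque([(start, 0)])
    total = weight_sum = 0.0
    while queue:
        page, d = queue.popleft()
        weight = decay ** d
        total += weight * scores.get(page, 0)
        weight_sum += weight
        if d < depth:
            for nxt in links.get(page, []):
                if nxt not in seen:  # visit each page once, even in link loops
                    seen.add(nxt)
                    queue.append((nxt, d + 1))
    return total / weight_sum


# A cloned, innocent-looking title page that redirects to the "real" site.
links = {"title": ["real"], "real": ["gallery"]}
scores = {"title": 0, "real": 8, "gallery": 8}
print(aggregate_score("title", links, scores, depth=0))  # title page alone
print(aggregate_score("title", links, scores, depth=2))  # with linked pages
```

With depth 0 the clean title page is all the filter sees; at depth 2 the pages behind it pull the score up, which is what exposes the deception.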
Because this system is so much under the control of the user, it is not, in any sense of the word, blocking free speech, or censoring anyone. The user has specifically selected the database and the criteria for exclusion. Anything blocked, then, is blocked in full knowledge and by the deliberate hand of the user, who should (by rights) have every right to say what they don't experience.
Also, by having networked databases, anyone can set a database up. If someone sets up a system that others feel is unfair, biased or unreasonably censoring, they're free to set up their own database. If the users agree with their opinion, they'll switch. If they don't, well, just too bad. They're entitled to their opinions, too, even if that means disagreeing.
This system could be extended, with a very simple front-end, to be a search engine as well as a filter. It works both ways. And, because of its design, you will get far fewer bogus hits. Instead of a few thousand hits, most of which are irrelevant sites and/or scams, you'll get just what you want, because you've screened out everything else.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)