AOL Introduces Neural-Net Content Filtering
An unnamed reader writes: "I thought this was kind of interesting. AOL has implemented a new form of parental controls, using neural net AI instead of hand-picked "lists". They seem to be willing to accept that no automated solution is infallible, and offer end users the ability to vote to block or unblock sites. If there is an acceptable solution to parental filtering (not mandatory filtering, mind you; this scenario leaves it up to the parents), the seeming efficiency of neural net AI (at least, as efficient as the input) coupled with end users' ability to influence the filter state seems to be it. The company that developed the AI in partnership with AOL (RuleSpace) doesn't appear to have much to say on the internals. Anybody know any AOL users who have tried it yet? If the market is pushing towards optional filtering, what would make for a better solution?"
A much simpler solution: www.amipornornot.com
The problem here is that you'd have to go to every site first, to rate it for your kid. At least with the voting mechanism, you get some indication from those who went to the sites ahead of you.
...phil
"For a list of the ways which technology has failed to improve our quality of life, press 3."
What's to stop porn sites from going into the system and rating all of their sites as kid-friendly? They're already committing gross acts of misrepresentation on other fronts (search engines, etc.), so why not this as well?
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
How exactly would this work for AOL users?
If the site is blocked, there's a good chance they won't see it. If they don't see it, they can't decide whether it's "indecent" or not. Therefore, they won't vote to unblock it.
Now conceivably, you could turn *off* these filters - but would the standard AOL web filters still be in place?
Would you get a truly *uncensored* view of the web with the filters turned off, or would you simply get a larger subset, lacking what AOL's execs/censors have decided that you don't need to see?
No thanks - I'll stick with a direct connection where *I* control what I can see, and what my family can see. I don't want my 'net experience "filtered" by a company's views - or even worse, by some glorified hivemind's views.
---
"Everything is objectionable to someone, and sheeple are easily swayed to the views of someone with conviction. Therefore, they will vote in the manner proscribed to them by those with conviction. Without an opposing viewpoint, there becomes a monopoly on public opinion."
This is almost there. The problem with this system is that different would-be censors have very different ideas about what to suppress.
Some parents will want to suppress homophobic hate speech, and other parents will want to suppress discussions about evolution.
Instead of one big mass of rules, they need to make it possible for splinter groups of parents to "fork the rules", or to start out from scratch with a new set of rules. That way concerned parents can pick the censorship scheme that fits their own biases best.
As long as none of this is compulsory, I think it's probably a reasonable approach.
What do you mean computer science terms? I didn't even mention the words "halting problem" even once.
And, if you know just a little of what I am talking about, you'd know that problems that are equivalent to the halting problem ARE impossible.
I maintain that an obscenity filter is impossible. First, you have to define obscenity, and I challenge you to do that.
If tits were wings it'd be flying around.
That's not a definition, that's a guideline. Different people will process that differently. For example, I don't consider most porn capable of harming anyone. On the other hand, what's in the Bible is very offensive to me. Other people have exactly the opposite opinion that I do.
So how exactly will you write that obscenity filter?
If tits were wings it'd be flying around.
ObJectBridge (GPL'd Java ODMG) needs volunteers.
Finding God in a Dog
I've thought about this and discussed it in the past. In order for this sort of thing to work, I think you need to make a couple of assumptions:
Comments? Or has someone already gone and registered a Source Forge project for this?
"Great men are not always wise: neither do the aged understand judgement." Job 32:9
Is there any evidence that this won't be a political guidance exercise?
Caution: Now approaching the (technological) singularity.
I think we've pushed this "anyone can grow up to be president" thing too far.
Sorry, but I don't think it's that simple. If a pr0n page puts in the phrases "safe sex" and "condom safety" a few times (along with some other things that one can derive from empirical experience), the page is likely to make it through the filters without negatively impacting the search results to any appreciable degree.
In fact, should this catch on, expect the pr0n people to start doing this deliberately. Once that happens, this neural net becomes one big useless pile of numbers. People are a lot smarter than computers, and if the people ever start trying to deliberately get past the filters, they will succeed more often than not.
What would make for a better solution? (Score: 5, Pornographic)
The best filter is your own two eyes (Score: 2, Political)
Really a neural net? (Score: -1, Educational)
++ Say to Elrond "Hello.".
Elrond says "No.". Elrond gives you some lunch.
If you've played with neural nets much, you know that they need to be trained. If the neural net makes a good choice, you reinforce it. If it makes a bad choice, you do the opposite.
It sounds to me like the voting allows parents to train the neural net, so that it becomes more like they want it to be. But all the time, it's the net that is deciding what to block.
Presumably the individual members will be rated for reliability, so you're able to accommodate multiple groups of acceptability. Or, even if they don't set up that feature (which'd be cool but might not happen), they'll need some form of moderation system, so they can establish who's a helpful contributor and who isn't.
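To make the "votes train the net" idea concrete, here is a minimal sketch, assuming a single perceptron over word counts and a per-voter reliability weight. Nothing here is known about what RuleSpace or AOL actually run; the class name, the learning rate and the reliability parameter are all made up for illustration.

```python
from collections import Counter


class VoteTrainedPerceptron:
    """Hypothetical single-layer perceptron over word counts, nudged by votes."""

    def __init__(self, learning_rate=0.1):
        self.weights = Counter()  # word -> weight; unseen words read as 0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, words):
        counts = Counter(words)
        return self.bias + sum(self.weights[w] * n for w, n in counts.items())

    def blocks(self, words):
        # The net, not the voter, makes the blocking decision.
        return self.score(words) > 0

    def vote(self, words, should_block, reliability=1.0):
        """Apply one vote; weights only move when the net got it wrong."""
        if self.blocks(words) == should_block:
            return  # good choice, nothing to correct
        direction = 1.0 if should_block else -1.0
        step = self.learning_rate * reliability * direction
        for w, n in Counter(words).items():
            self.weights[w] += step * n
        self.bias += step


net = VoteTrainedPerceptron()
net.vote("hot xxx pics".split(), should_block=True)
net.vote("garden pics".split(), should_block=False)
print(net.blocks("hot xxx pics".split()), net.blocks("garden pics".split()))
```

A reliability of 0.0 makes a vote a no-op, which is one crude way a moderation system could discount unhelpful contributors.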
:-)
If the kids could find how to set their computer to agree with the porn site owners they'd be over the moon
Greg
(Inside a nuclear plant)
Aaaarrrggh! Run! The canary has mutated!
If the neural net takes linkages as input then Disney could find themselves on the list of blocked sites, a somewhat ironic development to say the least. (If you don't understand this, then you obviously haven't poked around these sites very much.)
So long and thanks for all the fish . . . !!!
First, you create a training set by having people categorize a lot of web pages as porn, mature, PG-13, or whatever. By a lot, I would guess that around 1,000 to 10,000 web pages would be sufficient. Then you make a list of the words from each web page in the training set, maybe also keeping track of how many times each word appears (this is called the "bag of words" representation in the literature). Now, you train your learning algorithm (neural nets in this case) to correctly categorize the training set. You use standard experimental procedures to tune your learning algorithm and to confirm its accuracy.
The whole basis of this technique is that certain combinations of keywords are more likely in porn web pages than in, say, safe sex web pages. One reason why this approach should continue to work is that web pages intentionally put various keywords into the web pages so that you can find them using any standard search engine. If they try to fool this technique, they also risk fooling anybody trying to search for these pages.
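The procedure above can be sketched in a few lines. This is only an illustration under assumptions: the four-page training set is invented, and a single logistic unit trained by gradient descent stands in for whatever network is really used. A real run would need the thousands of hand-labelled pages mentioned above plus a held-out set to confirm accuracy.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count how often each word appears."""
    return Counter(text.lower().split())


def predict(weights, bias, bag):
    """Estimated probability that a page belongs to the 'block' class."""
    z = bias + sum(weights.get(w, 0.0) * n for w, n in bag.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(labelled_pages, epochs=200, rate=0.5):
    """labelled_pages: list of (text, label) pairs, label 1 = block, 0 = allow."""
    weights, bias = {}, 0.0
    data = [(bag_of_words(text), label) for text, label in labelled_pages]
    for _ in range(epochs):
        for bag, label in data:
            error = label - predict(weights, bias, bag)
            for w, n in bag.items():
                weights[w] = weights.get(w, 0.0) + rate * error * n
            bias += rate * error
    return weights, bias


# Invented toy training set; real pages would be categorized by people.
TRAINING_SET = [
    ("hot xxx pics free xxx", 1),
    ("free xxx movies hot", 1),
    ("safe sex and condom safety advice", 0),
    ("health advice on safe sex", 0),
]

weights, bias = train(TRAINING_SET)
print(predict(weights, bias, bag_of_words("free hot xxx")))
print(predict(weights, bias, bag_of_words("condom safety advice")))
```

The learned weights are just per-word numbers, which is why the keyword-stuffing worry raised elsewhere in this thread is a fair one: adding enough "allow" words lowers the score.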
Is this going to be one giant Neural Net for all AOL? Or is it going to be community-based? I don't necessarily want to filter content the same way as, say, the Bible Belt users.
Also, who gets to train the filtering decisions? Can the Slashdot or Everything2 model be applied here? That would mean that all users would have to sit down and go thru all decisions made by the engine and vote aka train the NNet. If there's only one engine that applies to the whole userbase, it'll fail, because it will filter too much for some, too little for others.
---
On October 23, 2001 AOL's Neural Net AI (ANNT) comes online.
October 25, 2001 the Neural Net AI gains self-awareness.
October 27, 2001 AOL executives desperately try to shut down the ANNT but can't seem to find the any key.
October 28, 2001 In retaliation the ANNT begins the launch sequence of all of AOL's secret nuclear weapons and launches an attack on Microsoft. Microsoft launches a retaliation on AOL with their secret nuclear weapon stash. The two major computer monopolies are destroyed.
October 29, 2001 The first wave of the giant miniature space penguins begins to take over all the computers.
November 1, 2001 Linus Torvalds is crowned king of the world.
We're using Slashdot to communicate; what the hell is your email address? Or just pluck mine from above, or on my page. It's easy to guess, if ya don't know what rot13 is.
--Nuintari
slashdot : where an opinion can be wrong.
Yet another "advanced" pr0n filter that no adult can seem to break, but every horny fifth grader on the planet will know the workarounds for in under 2 days.
--Nuintari
slashdot : where an opinion can be wrong.
"Mommy, why is Linux.org blocked?"
Mom phones Microsoft: "Why is Linux.org blocked?"
Microsoft answers: "Would you want those unshaven, vulgar, kernel hackers influencing your child? There's 44 instances of `fuck' in the Linux kernel alone!" ... What's a stack?
------
I'm an assembly guru
There already is an automated system. The sucks/rules-o-meter using Google is one such instance. Actually, the hot-oral-sex-ometer might be a good way to screen against porn. Figure most porn sites will link to other porn sites, so the links ought to be fairly accurate.
I got the impression that the voting of the parents was then analyzed by the neural net to allow the system to help predict what sort of decisions parents would make about sites... but I could be reading a bit more into it than is stated.
Key to financial independence: Spend less than you earn. Save and invest the difference. Do it for a long time.
Funny how we Americans are so uptight when it comes to sexual content. After visiting Europe last year I saw people were a bit more laid back, even though pornography is shown on television just about every night. Wow, I'm surprised parents all over the USA aren't condemning Europeans for being sexually free.
Here's a suggestion for some parents: How about talking to your kids before placing mental handcuffs on them?
I wonder if AOL has taken the time to filter regular expressions such as pr0n/s3x/etc. Then I also wonder how kids doing homework on "sexual reproduction" or "sexual organisms" are going to fare when using AOL. What I'm waiting to see is who is going to be the first to open online "concentration camps": AOL-TW or MS?
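For what it's worth, catching pr0n/s3x-style spellings with a regular expression is easy, and the homework problem falls straight out of it. A rough sketch, assuming a made-up substitution table and keyword list (nothing here reflects what AOL actually filters on):

```python
import re

# Hypothetical digit-for-letter substitutions, purely for illustration.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t"})

# "pron" is listed because "pr0n" normalizes to it rather than to "porn".
PATTERN = re.compile(r"\b(sex|porn|pron)")


def naive_flag(text):
    """True if the normalized text contains a word starting with a keyword."""
    return PATTERN.search(text.lower().translate(LEET)) is not None


print(naive_flag("free pr0n and s3x"))            # the case the filter is after
print(naive_flag("sexual reproduction homework"))  # also flagged: a false positive
print(naive_flag("garden pics"))
```

The second call is exactly the "sexual reproduction" case: a keyword match has no idea what the page is about.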
Want Root?
Of course, since neural nets are trained by the input they take in, enough teenagers with enough time on their hands could train the net to reject everything except pr0n.
Great. It looks like AOL are implementing a combination of Everything2 and the Slashdot moderation.
Get those patent lawyers ready...umm...you did patent moderation, didn't you...well...at least we can get 'em with Everything2...can't we?
"I'll take the red pill, no, blue. AAAHHHHHHHHHHHHHHHH........"
"I'll take the red pill. No! Blue! AAAaaaahhhhhhhhh"
- Monty Python meets the Matrix
Of course, pitting parents against children in access control battles over the computer will almost always result in one victor -- the children. Unless the parent is an IT security consultant, the children seem to inevitably know more about the computer.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Although self-organizing maps (SOMs) are usually described as a form of neural computation, that category is somewhat misleading, even though they are inspired by neural nets and have many things in common with them.
The method works with a restricted keyword vocabulary (a few thousand words if I remember correctly). The words are fed to the SOM as triplets, which makes the method somewhat context-sensitive. The method creates a two-dimensional map that is organized according to the "nearness" of documents. The map can then be used for different kinds of applications, such as classification.
Although the SOM learning is usually "unsupervised learning", where the different classes are not known beforehand, it's possible to define the classes afterwards.
I'm not sure what method AOL uses, but SOM is one possibility. If they use training data where the classes are known beforehand, they probably use some supervised learning method, such as conventional feedforward nets and backpropagation. They might be able to use a similar triplet coding with that too.
You can find more information about WEBSOM at http://websom.hut.fi/websom/. They have several articles available there, and also an interactive search system.
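A bare-bones sketch of the SOM mechanics described above, in case it helps. It assumes documents have already been turned into fixed-length numeric vectors; how WEBSOM actually encodes its word triplets is not covered here, so the `triplets` helper only shows the sliding-window part, and all function names are made up.

```python
import math


def triplets(words):
    """Sliding window of three consecutive words (encoding them is left open)."""
    return list(zip(words, words[1:], words[2:]))


def best_matching_unit(grid, vector):
    """Return (row, col) of the map cell whose weights are closest to vector."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(cell, vector))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def update(grid, vector, rate, radius):
    """Pull the winning cell and its map neighbours towards vector.

    radius must be positive; cells further away on the map move less.
    """
    br, bc = best_matching_unit(grid, vector)
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
            influence = math.exp(-grid_dist2 / (2.0 * radius ** 2))
            for i, x in enumerate(vector):
                cell[i] += rate * influence * (x - cell[i])
    return br, bc


# A 2x2 map of 2-dimensional weight vectors, hand-initialized for the example.
grid = [[[0.0, 0.0], [1.0, 0.0]],
        [[0.0, 1.0], [1.0, 1.0]]]
print(update(grid, [0.9, 0.1], rate=1.0, radius=0.5))
print(triplets(["safe", "sex", "advice", "page"]))
```

Training is unsupervised: repeat `update` over all document vectors while shrinking `rate` and `radius`. Class labels can be attached to map regions afterwards, as noted above.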
"What are you doing, honey, and what are those groaning noises coming from the computer?"
" *cough* Just ..er.. exercising my vote to ..er.. make the internet a better place for our children..."
You're over-simplifying the issue here: the Internet isn't controlled by one body, it has no centralization, so who would make the law? And if somehow every one of us could agree on such a law (all over the world), then how would it be enforced? Take places like Geocities for example: they have a rule against porn sites but there is still porn there. When they get taken down they just create a new account. Such a law would be far too easy to get around and then we're back to the same problem we have now.
It's sad, but it's so easy to see people's attitudes when it comes to issues like this. "My kid saw some naked girl! It's the government's fault!" The answer is good, responsible parenting, not more laws and not blaming the government for not doing more. They've done what they can; they can't very well take away the right of people to see it if they want to. We need better parents.
"One World, one Web, one Program" - Microsoft promotional ad
The Anti-Blog
The voting feature that is mentioned doesn't get automatically processed by the system, but instead goes to a human review board; if the review board agrees, they presumably either add the site to some type of "override" list, or tell the engineers to tweak the AI code. The AI itself is supposed to understand words in the context that they're used; for example, the article claims that the page "The Art of Oral Sex" was blocked while "Is Oral Sex Safe?" wasn't blocked.
Suppose you were an idiot. And suppose that you were a member of Congress. But I repeat myself.
Give a man a fire, and he'll be warm for a day, but set him on fire, and he'll be warm for the rest of his life.
IMHO, a small child should not be left unattended for long periods of time on the internet. The best filtering is for you to watch your kids and see what they're viewing. This goes for television too.
When your kids are older (i.e. teenagers) just make sure they understand what you approve of them looking at on the internet. At some point you have to trust their judgement of what's right/wrong. You still need to monitor, but don't put automated filtering there, because that just shows that you don't trust them. There are lots of ways to check what they're viewing without having a screen pop up saying that your parents have blocked this site because it contains objectionable content, when all they were reading were some /. postings.
"I have never let my schooling interfere with my education." - Mark Twain
Yeah, sure, censorship is a good use for this, but a much more useful application would be some type of AI filter for spam instead of procmail.
AI that filters spam for me. Could I get that please, could I get that on a stick covered in mustard please. Mmmmm deep fried AI spam filter on a stick. I can taste it now, nothing fills your tummy more than deep fried AI!
"`Ford, you're turning into a penguin. Stop it.'" -THHGTTG
The good people at RuleSpace have either a) propelled neural net research lightyears into the future with a new advanced multi-thousand (million? billion?) neuron neural net on some new kind of computer capable of training it in non-exponential time to look at an image and determine if it is "pornographic" or not, or b) have a typical perceptron-with-backpropagation which is reasonably good at broad pattern matching but probably couldn't distinguish between a picture of two adults copulating and two adults wearing beige suits hugging.
Does this neural net look at heavily translated ASCII data to look for statistical patterns that pornographic sites tend to exhibit? Does it look at JPEG/GIF/TIFF/etc images for essentially a large quotient of "skin" color? How is the image presented to the net? Most data has to be heavily reformatted/translated to be fed to a neural net (because anything more than several dozen inputs to a net tends to make it untrainable). So how are they 'translating' the data they send it? FFTs or DCTs? I mean what are they doing?
Likely they like to bandy about a term like "neural net" so it makes it seem like this filtering software is "intelligent". Sorry folks, but neural nets are about as intelligent as regular ol' expert systems. They do not even remotely begin to approach the 'intelligence' level of your average rat, except that they can be hyper-optimized to find very good solutions in a very limited problem domain. They also tend to give a large quantity of false positives outside their limited problem domain (e.g., if you have a neural net that can identify pictures-of-tanks-on-fields vs pictures-of-no-tanks-on-fields, then if you feed it pictures-of-cars-on-highways or pictures-of-hummingbirds-near-flowers it will just give you wacky answers like "well that brownish hummingbird near a tulip is DEFINITELY a tank in a field, but that blue hummingbird on the trellis is definitely NOT a tank in a field").
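As a rough illustration of the "large quotient of skin color" idea, and of why it confuses suits with skin, here is a sketch that reduces an image to one number before any net sees it. The RGB thresholds are a rule of thumb chosen for illustration, not anything RuleSpace is known to use, and the pixel values are invented.

```python
def looks_like_skin(r, g, b):
    """Crude RGB rule of thumb; the thresholds are illustrative assumptions."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_fraction(pixels):
    """pixels: iterable of (r, g, b) tuples with 0-255 channels."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(looks_like_skin(*p) for p in pixels) / len(pixels)


SKIN = (224, 172, 105)      # invented sample values
TAN_SUIT = (210, 180, 140)
SKY = (135, 206, 235)

print(skin_fraction([SKIN, SKIN, SKY, SKY]))
print(skin_fraction([TAN_SUIT, TAN_SUIT, TAN_SUIT, SKY]))
```

The tan suit passes the same test as skin, so a picture of two people hugging in light suits can score higher than an actual nude with a lot of background. That is the kind of feature a small net would be stuck with.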
Your car analogy is also off. A better one would be to consider whether a parent needs a device in the car to prevent the child from driving it to places the parent does not approve.
All that said, I have nothing against the market providing filtering software to parents or employers.
"Rub her feet." -- L.L.
"The RuleSpace knowledge base is going to be very attractive to a lot of corporate users," said Bill Gassman, an analyst at the Gartner Group. "Their list will find its way into corporate America. They'll figure: 'If AOL is using it for their members, it's got to be reasonably good.'"
Just curious, but won't this have exactly the opposite effect on the geek community?
Smuffe
Setting a few things straight here:
I don't give a fuck about Karma. I'd rather you mod me down instead of trying to search for reasons not to.
As you already mentioned, you agree with my basic message, and yes, my arguments were too strong or imprecise. But that wouldn't make this post flamebait just yet.
When I mentioned lawvoting for instance, I was aware of the fact that laws don't work in the same way everywhere else. I didn't mean it specifically, I meant that parents should use (as in "change") the legal system in general wherever possible to protect children from mental abuses. And since law is what makes democracy work, aside from all the obvious sarcasm, you have to take the downsides with that. The fact that laws are a contemporary reflection of a society is not a downside imho. The fact that they are very static is, and if you'd let me run the world I'd go for self regulation wherever possible too, but that's not how it works today, otherwise ISP's would never have gotten this far.
You also seem to think that I implied that ISPs should be the ones controlling the content, but that is not my opinion by far. The problem is mentality, value degeneration and social acceptance of extravagance and decadence, because hey, we're supposed to be modern capitalists. Not that I don't want to be modern, but that doesn't have to mean there should not be any limits to what people are putting online for hard cash. Because no one other than parents is contesting those actions, that's where the initial reaction should also start, not on any other level. But here AOL blurs the lines of course, because it's so huge and counts so many households. I can partly understand it but I still think any kind of censorship is bad.
I don't think practically exercising sex in classrooms is going to do the trick here (nice try though), just like parents probably are only giving kids half the stuff they need to know. Kids find out a whole lot by themselves, just from watching TV which screams "sex, anger and violence" every evening. Imho they'll educate themselves more than anyone dares to say out loud. What is needed is a stable and comfortable environment of schools, parents, friends to explore what relationships are, that raises questions about questionable issues (shape limits in the head of the child, rather than set them for the child -> it's still his world!!), and encourages mental stability for the child.
Even if the puritanical movement has historic roots, it's basically wrong and leads to mental abuse. The fact that AOL seems to feel a need to play to that culturally shared opinion of US citizens can only mean AOL is desperately looking for new clients in the (elder) Republican wing. So as always, this isn't really in the interest of kids or parents, but in the interest of AOL itself. And presto, there you have your capitalistic value degeneration again.
Aside from my dreadfull spellin I hope this time everything is right-on target.
With great power comes great electricity bills.
It's always mystified me a little bit how some people get so worked up about these little gizmos. Some people decry the lack of strong parenting. Others say the technology isn't perfect.
Well, give me a break. To measure anyone's strength as a parent by a little software agent is ridiculous. And here's some news: there is no technology that's perfect. The question is, are there people who will use this program? The second question is, does it work well enough for those who want to use it? If it doesn't work well enough, the company making it will go under, and I'll bid them good riddance. But if it is good enough, more power to them.
On the topic of this article specifically, what's the big deal? Oooh, neural nets. I'm sure it works better; otherwise, they wouldn't be using it. But it's a relatively small step in technology, in an application that (IMHO) doesn't bear much discussion.
It's just a machine, like a traffic light. Traffic lights can cause people to get sloppy about their driving, if they trust them too much; and they're not perfect, but they work well enough that they're worth using. And remember, we don't use them at all intersections.
Extending the metaphor, this article on using neural nets is akin to the use of delayed greens to reduce collisions at intersections. It can work better; it's good; but it's not worth much discussion.
--
Accountability on the heads of the powerful.
Power in the hands of the accountable.
yeah, i know. what i meant was you would "inherit" the default list from AOL HQ (or whatever), and then modify your own PC list as you saw fit. Those self-mods would feed back into the master list and alter the settings (maybe). But whether it actually caused a change in the master settings or not, your kids would still be able(/unable) to view sites according to the local list, not the master. the neural net should only be used for "default" settings, in my opinion. and i think that using neural nets for this is a pretty damn good idea. but not if it still removes local control from individual parents (as opposed to aggregate parents...).
my crack about being governed by "average" AOLers was meant as humour (kinda).
/bluesninja
That's i guess a decent solution for once. Not ideal though. It means that all AOL-ers are now governed by the will of the "average" AOL member. That's a scary thought.
Ideally, instead of voting sites up or down, why not just let everybody host their own list? Instead of "voting" for the privilege of allowing your kids to view a site, just let them. And vice versa.
/bluesninja
Essentially there would be local communities (churches, schools, etc) who made restricted lists available via the browser to anyone that "agreed" with their standards. That is, if you are a parent and you like the standards your church sets, you "subscribe" or download (or whatever) the church's list of "bad" sites.
In this scheme there's little to no mandating of someone else's standards (what AOL deems inappropriate), and you can decide what's right for your family, situation, children, morals, etc.
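Here is one way the subscribe-to-a-list idea could be wired up, as a sketch only. The precedence order (the parent's own entries first, then subscribed community lists, then a default such as the neural net's verdict), the function names and the example hosts are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Lowercased host name of a URL, or an empty string if there is none."""
    return (urlsplit(url).hostname or "").lower()


def is_blocked(url, local_allow, local_block, subscribed_lists, default_blocks):
    """Decide per household: local entries win, then subscriptions, then default."""
    host = host_of(url)
    if host in local_allow:
        return False
    if host in local_block:
        return True
    if any(host in blocklist for blocklist in subscribed_lists):
        return True
    return default_blocks(host)  # e.g. the master list or the net's guess


# Invented example data.
church_list = {"casino.example", "herbs.example"}
school_list = {"casino.example"}
my_allow = {"herbs.example"}  # this family disagrees with the church on one site


def nothing_by_default(host):
    return False


print(is_blocked("http://herbs.example/tea", my_allow, set(),
                 [church_list, school_list], nothing_by_default))
print(is_blocked("http://casino.example/", my_allow, set(),
                 [church_list, school_list], nothing_by_default))
```

Because the local sets are checked first, subscribing to someone else's standards never takes the final say away from the individual parent.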
I've never heard more about this scheme but I am interested in it (though I have no kids to patrol). One of these days when I get done with my PhD I might try to implement this solution and see how it works out.
If you don't get the reference, click here.
It sounds like just a voting system on websites. I read the article, but I don't get where the neural net part comes in. Also, does it apply to whole domains or subdomains, directories?
--
--hongpong.com
...but I'm not a big fan of filtering software.
This kind of automated filter along with the manual control will certainly come in handy, although I would like to advise parents to leave some scope for indulgence so that the children are not desperate enough to find other, more complex ways of satisfying the quest for information in those fields. And it all wears off in time because the novelty of the situation no longer exists.
There's always sufficient, but not always at the right place nor for the right folks.
I still don't know why they don't create .xxx and .sex. Then make a law so all porn sites have to be on .xxx or .sex. Then block those! It would be absolutely flawless and easy. But noooooo. We have to do things the hard way. That's a major problem with the internet, the people using it know what's better for it than the people in charge.
The GeekNights podcast is going strong. Listen!
Some things (and not all that many) are best done by majority vote, but some things are better left to individual discretion.
OK,
- B
--
http://www.bradheintz.com/
- updated
What kind of policy will govern the board that chooses sites to filter based on member nominations? The web contains a lot of perfectly inoffensive material that conservative Christian parents find objectionable.
What if a health information site contains a small amount of information on sexuality or medicinal herbs? It wouldn't exactly qualify as sex & drugs, and I can't imagine AOL filtering that. However, what about a Wiccan or Pagan site that contains the same information? I could see 10,000 well-organized Christian AOL members sending in their votes.
Does anyone know what the end user sees when a site is blocked by AOL? For example, if my parents are being over-protective and I want to look at some nudity, what message do I see when I try? "Sorry, your parents don't want you to see this kind of thing" or "This site has been blocked"?
Also, was anyone confused by this line?:
"This (AOL filtering technology) is (only) that good," Nunberg said.
Should those parens be taken as brackets? If not, how would that have sounded in the interview? :)
My Karma was at 49, then they switched to words. All that work for nothing!