Wikipedia Used for Artificial Intelligence

Posted by Zonk on Sunday January 7, 2007 @06:25AM from the great-it-has-finally-become-self-aware dept.

eldavojohn writes "It may be no surprise, but Wikipedia is now being used in the field of artificial intelligence. The applications for this may be endless. For instance, spam fighting is a tough front, and it looks as though researchers are now turning to an ontology- or taxonomy-based solution to fight spammers. The concept is also at the forefront of artificial intelligence, both in progress toward an application that passes the Turing Test and in creating semantically aware applications. The article comments on uses of Wikipedia in this manner: '"... spam filters block all messages containing the word 'vitamin,' but fail to block messages containing the word B12. If the program never saw B12 before, it's just a word without any meaning. But you would know it's a vitamin," Markovitch said. "With our methodology, however, the computer will use its Wikipedia-based knowledge base to infer that 'B12' is strongly associated with the concept of vitamins, and will correctly identify the message as spam," he added.'"
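
The 'B12' example in the summary can be illustrated with a toy sketch. This is not the researchers' method, which the summary does not describe in detail. It assumes a hand-made table (`CONCEPT_ASSOCIATIONS`, with invented weights) standing in for a Wikipedia-derived knowledge base, and the names `BLOCKED_CONCEPTS`, `THRESHOLD`, `concept_scores`, `is_spam`, and `keyword_filter` are all hypothetical.

```python
import re

# Hypothetical stand-in for a Wikipedia-derived knowledge base:
# term -> {concept: association strength in [0, 1]}. All values are invented.
CONCEPT_ASSOCIATIONS = {
    "vitamin": {"vitamins": 1.0},
    "b12": {"vitamins": 0.9},
    "nebraska": {"us_states": 1.0},
}

BLOCKED_CONCEPTS = {"vitamins"}  # concepts the filter treats as spam topics
THRESHOLD = 0.5                  # minimum association strength to count (assumed)


def keyword_filter(message):
    """Naive filter: flags only messages containing the literal word 'vitamin'."""
    return "vitamin" in message.lower()


def concept_scores(message):
    """Map each token to its concepts, keeping the strongest association per concept."""
    scores = {}
    for token in re.findall(r"[a-z0-9]+", message.lower()):
        for concept, strength in CONCEPT_ASSOCIATIONS.get(token, {}).items():
            scores[concept] = max(scores.get(concept, 0.0), strength)
    return scores


def is_spam(message):
    """Flag a message if any blocked concept is associated strongly enough."""
    scores = concept_scores(message)
    return any(scores.get(c, 0.0) >= THRESHOLD for c in BLOCKED_CONCEPTS)
```

With this table, `keyword_filter("Cheap B12 pills")` misses the message while `is_spam("Cheap B12 pills")` flags it, because 'b12' is linked to the 'vitamins' concept. How well this works in practice depends entirely on where the association strengths come from and where the threshold is set.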

uh oh, there goes wikipedia by ILuvRamen · 2007-01-07 06:32 · Score: 4, Interesting

Don't you think masses of spammers are going to screw with Wikipedia strategically, on purpose, so that it doesn't work properly for that, if it starts to work very well to block them? They should just stop being afraid of being called racist and super-filter every e-mail that comes out of South Korea, Indonesia, and especially Nigeria, etc. Type spam map into Google image search to see how blatantly obvious it is where the spam comes from. Something like 98% of spam can be pinned down to 0.01% of the world by square footage. If they added fuzzy logic instead of alterable AI, and only blocked e-mails from South Korea with the word vitamin while not blocking ones from Nebraska with the word vitamin, then the problem would be decreased dramatically.

--
Google's Super Secret Search Algorithm: SELECT @search_results FROM internet WHERE @search_results = 'good'
1. Re:uh oh, there goes wikipedia by WilliamSChips · 2007-01-07 06:50 · Score: 4, Insightful
  
  You don't think there are hundreds of thousands of zombifiable computers in the United States? And what about people with business connections in China or Korea?
  
  --
  Please, for the good of Humanity, vote Obama.
2. Re:uh oh, there goes wikipedia by ScentCone · 2007-01-07 07:04 · Score: 5, Interesting
  
  You don't think there are hundreds of thousands of zombifiable computers in the United States?
  
  Um, so? That doesn't make it inappropriate to block traffic from places where the overwhelming majority of the packets are toxic. It's a system-by-system, admin-by-admin judgement call, but there's no question that Korea isn't doing nearly enough to stop this problem locally. If the local culture starts to realize that they're isolating themselves from large sections of the internet because they won't do something to prevent 99% of their outbound mail from being spam, then maybe the need to filter will also go away.
  
  And what about people with business connections in China or Korea?
  
  I have a lot of customers with contacts like that. All of them (their Asian contacts) use Yahoo, Gmail, and similar accounts specifically to avoid this problem. Businesses in China and Korea are totally aware that most ISPs in those areas have poisoned outbound SMTP relays and user desktops. Or, they host their western-facing mail servers with providers in the west - I see a lot of that, too, since many of those businesses have two separate messaging platforms for the different international audiences with whom they communicate.
  
  --
  Don't disappoint your bird dog. Go to the range.
3. Re:uh oh, there goes wikipedia by Mr+Chund+Man · 2007-01-07 07:47 · Score: 5, Interesting
  
  Spam Map
  
  "South Korea, Indonesia, and especially Nigeria, etc"
  While we're at it, why not block Alberta, California, North Carolina, Virginia, Colorado, Oklahoma, Kansas, Vermont, New Hampshire, Massachusetts, Spain, France and Portugal - all spam hotspots according to the map cited? What's that, you receive email from people in these places? Tough titties, if we're to block email coming from spam hotspots as you say.
  
  Also, you've managed to point a finger of blame at Indonesia and Nigeria who are saintly in comparison to some more developed nations. Go racism!
Nothing new here... by Bodrius · 2007-01-07 06:35 · Score: 5, Funny

This isn't new to Slashdotters...

For years, Slashdot posts have used Wikipedia as a form of artificial intelligence.

--
Freedom is the freedom to say 2+2=4, everything else follows...
i prefer by macadamia_harold · 2007-01-07 06:38 · Score: 4, Funny

For instance, the front of spam fighting is a tough one and it looks as though researchers are now turning towards an ontology or taxonomy based solution to fight spammers.

I think it would be much more effective if we used a taxidermy-based solution to fight spammers.

--
Push Button, Receive Bacon
UMMMM wordnet? by Anonymous Coward · 2007-01-07 06:50 · Score: 4, Informative

This kind of technique has been used for a while.

http://wordnet.princeton.edu/

And according to my source of AI, Wikipedia (http://en.wikipedia.org/wiki/WordNet), WordNet (like all sophisticated software) has been in development since the mid-eighties.

WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser. WordNet is also freely and publicly available for download. WordNet's structure makes it a useful tool for computational linguistics and natural language processing.
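
The synset structure described above can be sketched with a tiny hand-built graph. This is an illustration only: the entries in `SYNSETS` are invented rather than taken from the real WordNet database, and `synonyms` and `is_a` are hypothetical helper names, not WordNet's API.

```python
from collections import deque

# Invented miniature lexical network: each synset groups synonymous words
# and links to more general synsets ("hypernyms").
SYNSETS = {
    "b12": {"words": {"vitamin b12", "cobalamin"}, "hypernyms": ["vitamin"]},
    "vitamin": {"words": {"vitamin"}, "hypernyms": ["nutrient"]},
    "nutrient": {"words": {"nutrient"}, "hypernyms": []},
}


def synonyms(word):
    """Return the other words that share a synset with `word`."""
    found = set()
    for synset in SYNSETS.values():
        if word in synset["words"]:
            found |= synset["words"] - {word}
    return found


def is_a(synset_id, ancestor_id):
    """Breadth-first walk up the hypernym links; a synset counts as its own ancestor."""
    queue = deque([synset_id])
    seen = set()
    while queue:
        current = queue.popleft()
        if current == ancestor_id:
            return True
        if current in seen:
            continue
        seen.add(current)
        queue.extend(SYNSETS[current]["hypernyms"])
    return False
```

Here `is_a("b12", "nutrient")` follows b12 → vitamin → nutrient, which is the kind of relation a filter would need in order to treat 'B12' as a vitamin.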
Not very "intelligent" by iamacat · 2007-01-07 07:28 · Score: 4, Insightful

There are lots of legit e-mails discussing vitamins, Viagra or even penis enlargement, this post included.
Re:Since when by timeOday · 2007-01-07 07:54 · Score: 4, Informative

Since when does a database + automated search (keyword patterns and relations) = artificial intelligence?
What part of human/animal intelligence is not detecting, storing, and applying patterns and relations?
Re:Wikipedia needs work for spam filtering.... by Metasquares · 2007-01-07 08:31 · Score: 4, Insightful

Infer too much and the false positive rate skyrockets, though...