Slashdot Mirror


Wikipedia Used for Artificial Intelligence

eldavojohn writes "It may be no surprise, but Wikipedia is now being used in the field of artificial intelligence. The applications for this may be endless. For instance, the front of spam fighting is a tough one and it looks as though researchers are now turning towards an ontology- or taxonomy-based solution to fight spammers. The concept is also on the forefront of artificial intelligence and progress towards an application passing the Turing Test and creating semantically aware applications. The article comments on uses of Wikipedia in this manner: '"... spam filters block all messages containing the word 'vitamin,' but fail to block messages containing the word B12. If the program never saw B12 before, it's just a word without any meaning. But you would know it's a vitamin," Markovitch said. "With our methodology, however, the computer will use its Wikipedia-based knowledge base to infer that 'B12' is strongly associated with the concept of vitamins, and will correctly identify the message as spam," he added.'"
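
The idea in the quote can be sketched roughly as follows. This is a toy illustration, not the researchers' method: the knowledge base, its association strengths, the threshold, and the function names are all made up here as a stand-in for whatever a Wikipedia-derived knowledge base would actually provide.

```python
# Toy stand-in for a Wikipedia-derived knowledge base:
# term -> {concept: association strength}. All values are invented.
KNOWLEDGE_BASE = {
    "b12": {"vitamin": 0.9, "aircraft": 0.1},
    "niacin": {"vitamin": 0.8},
}

BLOCKED_CONCEPTS = {"vitamin"}
THRESHOLD = 0.5  # hypothetical cutoff for "strongly associated"


def concepts_for(word):
    """Return the word itself plus any concepts it is strongly associated with."""
    word = word.lower()
    related = {c for c, s in KNOWLEDGE_BASE.get(word, {}).items() if s >= THRESHOLD}
    return {word} | related


def is_spam(message):
    """Flag a message if any word is, or maps to, a blocked concept."""
    words = (w.strip(".,!?'\"") for w in message.split())
    return any(concepts_for(w) & BLOCKED_CONCEPTS for w in words)
```

A plain keyword filter would only catch the literal word "vitamin"; here "B12" is caught as well because the lookup expands it to the blocked concept.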

5 of 177 comments

  1. Artificial intelligence! by tcopeland · · Score: 3, Informative

    And all this time you thought it was just if and switch statements!

    Whenever someone claims that a program is semantically aware, be sure to reread Clay Shirky's article on the Semantic web.

  2. UMMMM wordnet? by Anonymous Coward · · Score: 4, Informative

    this kind of technique has been used for a while..

    http://wordnet.princeton.edu/

    and according to my source of AI, Wikipedia (http://en.wikipedia.org/wiki/WordNet), it
    (like all sophisticated software) has been in development since the mid eighties..

    WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser. WordNet is also freely and publicly available for download. WordNet's structure makes it a useful tool for computational linguistics and natural language processing.

  3. Re:The B12 example is horrible by tepples · · Score: 3, Informative

    Suppose somebody was trying to sell me a B12 bomber.

    Then your e-mail account's Bayes map would have the map (word B12 -> folder Aircraft) with a high probability, which would outweigh (word B12 -> article Vitamin -> folder Drug Spam).
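
    A rough sketch of that weighing, as a simple weighted sum of evidence rather than a full Bayesian filter. Every number, table, and name below is hypothetical, including the discount applied to evidence that passes through a knowledge-base concept.

```python
# Hypothetical per-user statistics: P(folder | word) learned from this user's mail.
DIRECT = {"b12": {"Aircraft": 0.95, "Drug Spam": 0.05}}

# Hypothetical knowledge-base links: word -> concept strength,
# and concept -> folder probability.
WORD_TO_CONCEPT = {"b12": {"vitamin": 0.9}}
CONCEPT_TO_FOLDER = {"vitamin": {"Drug Spam": 0.8}}

INDIRECT_WEIGHT = 0.5  # assumed discount for concept-mediated evidence


def folder_scores(word, direct=DIRECT):
    """Sum direct evidence and discounted concept-mediated evidence per folder."""
    scores = {}
    for folder, p in direct.get(word, {}).items():
        scores[folder] = scores.get(folder, 0.0) + p
    for concept, strength in WORD_TO_CONCEPT.get(word, {}).items():
        for folder, p in CONCEPT_TO_FOLDER.get(concept, {}).items():
            scores[folder] = scores.get(folder, 0.0) + INDIRECT_WEIGHT * strength * p
    return scores


def best_folder(word, direct=DIRECT):
    """Pick the highest-scoring folder, or None if there is no evidence at all."""
    scores = folder_scores(word, direct)
    return max(scores, key=scores.get) if scores else None
```

    With the user's own history, best_folder("b12") picks Aircraft; for a user with no history (direct={}), only the vitamin link remains and the word lands in Drug Spam.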

  4. Not New, not newsworthy by Sub+Zero+992 · · Score: 3, Informative

    Anybody who has been working in the field of NLP (natural language processing) can do little more than sneer at this story.

    The field of word sense exploration is one of the more mature areas of NLP; take a look at Princeton's WordNet database for an example [http://wordnet.princeton.edu/]. Using their word sense database (without referring to silly words such as "ontology") it has been possible - for years - to discover if two lemmas (that's "words" to you) are related in a particular way, or not related. Using WordNet it is possible to distinguish between antonyms and homonyms, thereby thwarting spammers who use words which sound like "viagra" - "niagra" and words which have opposite meanings.
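
    To give a feel for that kind of lookup, here is a toy sketch in the spirit of WordNet's synsets and antonym links. It does not use the real WordNet data or any WordNet API; the synset IDs, the word lists, and the relation() helper are all invented for illustration.

```python
# Toy lexical database: synsets group synonyms, antonym links connect synsets.
# Entries are illustrative only, not taken from WordNet.
SYNSETS = {
    "hot.a.01": {"hot", "warm"},
    "cold.a.01": {"cold", "chilly"},
    "big.a.01": {"big", "large"},
}
ANTONYMS = {("hot.a.01", "cold.a.01")}


def synsets_of(word):
    """Return the IDs of every synset that contains the word."""
    return {sid for sid, lemmas in SYNSETS.items() if word in lemmas}


def relation(a, b):
    """Classify two words as 'synonym', 'antonym', or None (unrelated/unknown)."""
    sa, sb = synsets_of(a), synsets_of(b)
    if sa & sb:
        return "synonym"
    for x in sa:
        for y in sb:
            if (x, y) in ANTONYMS or (y, x) in ANTONYMS:
                return "antonym"
    return None
```

    Words absent from the database simply come back as None, which is the same blind spot the B12 example in the story is about.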

    --
    They who would give up an essential liberty for temporary security, deserve neither liberty or security - Ben Franklin
  5. Re:Since when by timeOday · · Score: 4, Informative

    Since when is a database + automated search (keyword patterns and relations) = artificial intelligence?
    What part of human/animal intelligence is not detecting, storing, and applying patterns and relations?