Developing a Vandalism Detector For Wikipedia

Posted by kdawson on Sunday February 28, 2010 @08:45AM from the false-positives-would-hurt dept.

marpot writes "In an effort to assist Wikipedia's editors in their struggle to keep articles clean, we are conducting a public lab on vandalism detection. The goal is the development of a practical vandalism detector that is capable of telling apart ill-intentioned edits from well-intentioned edits. Such a tool, which will work somewhat like a spam detector, will release the crowd's workforce currently occupied with manual and semi-automatic edit filtering. The performance of submitted detectors will be evaluated based on a large collection of human-annotated edits, which has been crowdsourced using Amazon's Mechanical Turk. Everyone is welcome to participate."

1 of 116 comments (clear)

Min score:

Reason:

Sort:

Re:Existing by marpot · 2010-02-28 09:38 · Score: 5, Informative

We have studied the accuracy of ClueBot, and found that (on a small corpus) it has very good precision (low falsy positive rate), but a very low recall (low true positive rate). (see: http://www.uni-weimar.de/medien/webis/publications/downloads/papers/stein_2008c.pdf) But the picture might look quite different on a large scale.