Competition Produces Vandalism Detection For Wikis

Posted by timothy on Sunday September 26, 2010 @02:40AM from the citation-needed dept.

marpot writes "Recently, the 1st International Competition on Wikipedia Vandalism Detection (PDF) finished: 9 groups (5 from the USA, 1 affiliated with Google) tried their best to detect all vandalism cases in a large-scale evaluation corpus. The winning approach (PDF) detects 20% of all vandalism cases without misclassifying regular edits; moreover, it can be adjusted to detect 95% of the vandalism edits while misclassifying only 30% of all regular edits. Thus, by applying both settings, manual double-checking would only be required on 34% of all edits. It is not yet known whether the rule-based bots on Wikipedia can compete with this machine learning-based strategy. Anyway, there is still a lot of potential for improvement, since the top 2 detectors use entirely different detection paradigms: the first analyzes an edit's content, whereas the second (PDF) analyzes an edit's context using WikiTrust."

20% with no false positives? by Dan+East · 2010-09-26 03:19 · Score: 3, Insightful

If the algorithm can detect 20% with perfection then that must constitute extremely low hanging fruit. That type of vandalism is just annoyance. It is so obvious that the end user readily recognizes it as such and can skip over it or revert the edit.
The real issue is disinformation, which is vastly more subtle. The only defense is fact-checking or seeking out references. If the algorithm is capable of recognizing that kind of vandalism then the developers should have the software writing all the articles in the first place, because it'd have to be pretty spectacular to manage that.

--
Better known as 318230.
1. Re:20% with no false positives? by bunratty · 2010-09-26 04:29 · Score: 2, Insightful
  
  Care to show us even one article where 99% of good edits are reverted? Remember, that will mean that over 99% of all edits are reverted.
  
  --
  What a fool believes, he sees, no wise man has the power to reason away.
2. Re:20% with no false positives? by Rhaban · 2010-09-26 06:54 · Score: 2, Informative
  
  Care to show us even one article where 99% of good edits are reverted? Remember, that will mean that over 99% of all edits are reverted.
  Not if there are bad edits that are not reverted.
and the reversionists? by Anonymous Coward · 2010-09-26 03:20 · Score: 2, Insightful

The people who "own" a page with the assistance of powerful insiders and revert any changes to their "pet" pages, even spelling fixes or simple corrections to bad information?
Will edits of *those* insiders, who are ruining Wikipedia for the rest of us, be flagged by the algorithm as vandalism?
1. Re:and the reversionists? by bunratty · 2010-09-26 04:30 · Score: 2
  
  Can you show us a page where any changes, even spelling fixes or simple corrections, are reverted?
  
  --
  What a fool believes, he sees, no wise man has the power to reason away.
100% effective method by robably · 2010-09-26 03:34 · Score: 3, Funny

Thus, by applying both settings, manual double-checking would only be required on 34% of all edits.
Or, you know, just keep applying the first setting that always correctly detects 20% of vandalism on the 80% that's left over, until there's nothing left. Problem solved.
1. Re:100% effective method by Zocalo · 2010-09-26 04:08 · Score: 3, Funny
  
  Otherwise known as Zeno's Dichotomy Paradox (often shortened to just "Zeno's Paradox", although he in fact suggested three).
  
  I suppose I should now go and vandalise the article to keep in the spirit of things. Hang on, I'm half way there...
  
  --
  UNIX? They're not even circumcised! Savages!
Manual double checking? by structural_biologist · 2010-09-26 03:36 · Score: 2, Interesting

I don't know where that 34% figure comes from for the manual double checking. The test set contains about 60% vandalism and 40% real edits, so I'll assume this represents the rate of vandalism on Wikipedia. Now, consider a set of 1000 edits. 600 would be vandalism while 400 would be real edits. The second filter would catch 570 instances of real vandalism along with 120 false positives. Even if you used the first filter to automatically remove the 120 instances of vandalism it finds, you would still be left with a set of 450 instances of vandalism + 120 false positives to check. This means that you would have to sort through about 57% of the original edits in order to identify the 120 false positives.
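The arithmetic above can be reproduced with a short Python sketch. The function name and parameters are illustrative, not taken from the paper, and the sketch assumes that everything the strict setting catches is also caught by the loose setting, which the summary does not state.

```python
def manual_review_load(total, vandalism_rate, strict_recall, loose_recall, loose_fpr):
    """Fraction of all edits a human still has to check (illustrative model).

    Assumption: edits caught by the strict (zero false positive) setting are a
    subset of those caught by the loose setting and are reverted automatically.
    """
    vandalism = total * vandalism_rate
    regular = total - vandalism
    auto_removed = strict_recall * vandalism      # 20% of vandalism, no review needed
    flagged_vandalism = loose_recall * vandalism  # 95% of vandalism
    false_positives = loose_fpr * regular         # 30% of regular edits
    to_check = (flagged_vandalism - auto_removed) + false_positives
    return to_check / total


# The comment's numbers: 1000 edits, 60% vandalism.
print(round(manual_review_load(1000, 0.60, 0.20, 0.95, 0.30), 2))  # 0.57
```

Under this model the load works out to 0.30 + 0.45 × (vandalism rate), so it would only drop to 34% if roughly 9% of edits were vandalism. Whether that is how the summary arrived at its figure is not stated.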
There is a pretty simple heuristic by pieterh · 2010-09-26 03:45 · Score: 2, Interesting

This comes from personally maintaining some 200+ wikis on Wikidot.com.
There are two kinds of vandals: those in the community of contributors, and those outside it. The first class of vandals cannot easily be detected automatically but when a wiki is actively built, the community will easily and happily fix damage done by these. The second class are usually spammers and come along when the wiki is stale. They are easily detected by the fact that a long static page is suddenly edited by an unknown person. It's very rare to find a real edit happening late after a wiki has solidified. We handle the second type of vandalism trivially by getting email notifications on any edits.
Trick is, wikis (maybe not Wikipedia but then certainly individual pages) don't have random life cycles but go through growth and stasis.
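A minimal Python sketch of the "stale page, unknown editor" heuristic described above. The 90-day threshold, the function name, and the idea of a known-contributors set are assumptions made for illustration; the comment only says the edits are caught via email notifications.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how long a page must sit unedited to count as "stale".
STALE_AFTER = timedelta(days=90)


def is_suspicious_edit(edit_time, last_edit_time, editor, known_contributors,
                       stale_after=STALE_AFTER):
    """Flag an edit to a long-static page made by someone outside the community."""
    page_is_stale = (edit_time - last_edit_time) >= stale_after
    editor_is_unknown = editor not in known_contributors
    return page_is_stale and editor_is_unknown


community = {"alice", "bob"}
print(is_suspicious_edit(datetime(2010, 9, 26), datetime(2010, 1, 1),
                         "drive_by_user", community))  # True
```

An edit by a known contributor, or any edit to a page that is still being actively built, passes through unflagged, which matches the comment's point that the community handles that first class itself.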

--
My blog
top 2 by trb · 2010-09-26 04:08 · Score: 2, Insightful

Anyway, there is still a lot of potential for improvement, since the top 2 detectors use entirely different detection paradigms
This implies that the lower-scoring detectors are less valuable in terms of looking for sources of improvement. That's not true, and that wasn't stated in the paper's "Conclusions" section. If the lowest scoring detector finds 5% of the bad data, and it's a different slice from what the other detectors find, then that's quite valuable.
Machine learning - right by Animats · 2010-09-26 04:21 · Score: 4, Informative

Wikipedia already has programs which detect most of the blatant vandalism. Page blanking and big deletions are caught immediately. Deletions that delete references generate warnings. Incoming text that duplicates other content on the Web is caught. That gets rid of most of the blatant vandalism. It's not a serious problem on Wikipedia.
The current headaches are mostly advertising, fancruft, and pushing of some political point of view. That's hard to deal with using what is, after all, a rather dumb machine learning algorithm that has no model of the content or subject matter.
There already IS a competitive angle by Grimbleton · 2010-09-26 04:23 · Score: 2, Insightful

They already compete to be the first to revert edits they disagree with.
Hah, bout time. by OnePumpChump · 2010-09-26 06:15 · Score: 2, Insightful

4chan and Somethingawful have been having Wikipedia vandalizing competitions for years. (Usually, whoever's edit or fake article stays the longest wins.)
Rules can only get so much by tawker · 2010-09-26 06:20 · Score: 3, Informative

As the owner of the first vandalism reverting bot in mainstream use - http://en.wikipedia.org/wiki/User:Tawkerbot2 I guess I have a bit of perspective on the whole problem. Originally the bot was designed / created to auto revert one very specific type of vandalism, a user who would put a picture of SpongeBob SquarePants (or Squidward or some other cartoon character) into pages while blanking them - that was pretty easy to get. Next we went to stuff like full page blanking, ALL CAP LETTER UPDATES and additions of a tonne of bad words, based on common vandalism trends (i.e., if a page had 0 profanity on it and someone added a few words it would be reverted; again, not too many false positives). That basically caught the "dumb kid" type of vandalism, and it was amazing how much the share of total edits it caught dropped when students went back to school.
The only problem, at the time, was that it was a resource pig. The bot was originally running on a P2 300MHz w/ a grand total of 256MB of RAM and the load got to be so high that we had to move it about 5 times.
It's interesting to note that at first, many many people were opposed to the idea of automated vandalism reversion; it was almost a contest to revert stuff first - and the bot would win a vast majority of the time. However, as time went on, my inbox started getting rather full whenever I had a power outage, the cat knocked the cord out of the box hosting it, etc. Community reaction to bots doing the grunt work in vandalism really changed.
Anyways, just my 2c on it, and just for the heck of it to prove I'm actually the Tawker on wiki, http://en.wikipedia.org/w/index.php?title=User%3ATawker&action=historysubmit&diff=387163504&oldid=268687392
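The three rules mentioned above (page blanking, all-caps additions, profanity added to a previously clean page) might look roughly like this in Python. This is an illustrative sketch, not Tawkerbot2's code: the word list is a placeholder, the thresholds are guesses, and the "added words" computation is a crude multiset difference rather than a real diff.

```python
import re
from collections import Counter

BAD_WORDS = {"darn", "heck"}  # placeholder list; a real bot would use a longer one
MIN_BAD_WORDS = 2             # hypothetical reading of "a few words"
MIN_CAPS_WORDS = 3            # hypothetical minimum size for an all-caps addition


def words(text):
    return re.findall(r"[A-Za-z']+", text)


def added_words(old_text, new_text):
    """Words that occur more often in the new revision than in the old one."""
    return list((Counter(words(new_text)) - Counter(words(old_text))).elements())


def vandalism_rule(old_text, new_text):
    """Return the name of the first rule that fires, or None."""
    if old_text.strip() and not new_text.strip():
        return "page_blanked"
    added = added_words(old_text, new_text)
    old_bad = sum(1 for w in words(old_text) if w.lower() in BAD_WORDS)
    new_bad = sum(1 for w in added if w.lower() in BAD_WORDS)
    if old_bad == 0 and new_bad >= MIN_BAD_WORDS:
        return "profanity_added"
    if len(added) >= MIN_CAPS_WORDS and "".join(added).isupper():
        return "all_caps"
    return None


print(vandalism_rule("Some article text.", "Some article text. THIS PAGE IS DUMB"))  # all_caps
```

Rules like these are cheap and rarely misfire, which fits the comment's "not too many false positives", but they only cover the blatant cases.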