Developing a Vandalism Detector For Wikipedia
marpot writes "In an effort to assist Wikipedia's editors in their struggle to keep articles clean, we are conducting a public lab on vandalism detection. The goal is the development of a practical vandalism detector that is capable of telling apart ill-intentioned edits from well-intentioned edits. Such a tool, which will work somewhat like a spam detector, will release the crowd's workforce currently occupied with manual and semi-automatic edit filtering. The performance of submitted detectors will be evaluated based on a large collection of human-annotated edits, which has been crowdsourced using Amazon's Mechanical Turk. Everyone is welcome to participate."
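For anyone wondering what "evaluated based on a large collection of human-annotated edits" could look like in practice, here is a minimal sketch. The summary does not say how the lab actually scores submissions, so the precision/recall/F1 scoring, the Edit class, the toy_detector rule, and the sample edits below are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Edit:
    old_text: str
    new_text: str
    is_vandalism: bool  # the human annotation (hypothetical field name)


def toy_detector(edit: Edit) -> bool:
    """Hypothetical rule: flag edits that delete most of the text or add '!!!'."""
    shrank_a_lot = len(edit.new_text) < 0.5 * len(edit.old_text)
    return shrank_a_lot or "!!!" in edit.new_text


def evaluate(detector: Callable[[Edit], bool], edits: Iterable[Edit]) -> Dict[str, float]:
    """Compare detector verdicts with the annotations and compute standard scores."""
    tp = fp = fn = 0
    for edit in edits:
        flagged = detector(edit)
        if flagged and edit.is_vandalism:
            tp += 1  # vandalism caught
        elif flagged:
            fp += 1  # good edit wrongly flagged
        elif edit.is_vandalism:
            fn += 1  # vandalism missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented sample data, not taken from the real annotated corpus.
SAMPLE_EDITS = [
    Edit("The cat sat on the mat.", "LOL!!!", True),
    Edit("The cat sat on the mat.", "The cat sat on the red mat.", False),
    Edit("Colbert is a self-described Democrat.",
         "Colbert is a self-described Republican.", True),
    Edit("A long introduction paragraph here.", "Intro.", False),
]

if __name__ == "__main__":
    print(evaluate(toy_detector, SAMPLE_EDITS))
```

On this invented sample the toy rule catches the blanking, misses the one-word swap, and wrongly flags a legitimate trim, which is roughly the trade-off a real detector would have to manage.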
Apparently, their vandalism detector currently works by automatically reverting any edits made by anonymous editors.
(And yeah, that's a bit sarcastic, but somewhat true.)
Welcome to Slashdot. Although everyone is welcome to contribute to Slashdot, at least one of your recent posts did not appear to be constructive and has been modded down. Please use TrollTalk for any test edits you would like to make, and read the welcome page to learn more about contributing constructively to this web site. Thank you.
The article on anti-vandalism bots had been recently vandalized when they were doing their preliminary research.
Conscience is the inner voice which warns us that someone may be looking.
I've had many more problems with admin abuse than vandalism. Vandalism is quick and easy to deal with. Admins are the biggest problem in Wikipedia editing; they have no accountability and abuse their power.
How about a log of each admin's activities, including reversions, bans, etc., and a way for non-admins to challenge actions (without spending countless hours in an appeal process worthy of a federal court)?
Before any more detectors are rolled out, how about they come up with a workable definition of vandalism? And actually use it fairly, ethically and logically.
There's a great deal of evidence to suggest the current definition of "vandalism" is anything a wikiadmin decides he just doesn't like, disagrees with, or finds in some way interferes with his power trip.
There is an art to Wikipedia abuse. If someone cites a Wikipedia article in some argument they're making, you can always just go to Wikipedia and edit the page so that they're wrong. But that's what a novice Wikipedia vandal does.
A pro knows to edit the article in a very subtle way, so that it looks like the person has poor reading comprehension. Let's say the person cites a Wikipedia article with a sentence like this, in order to support the argument that Colbert is a Democrat.
Although by his own account he was not particularly political before joining the cast of The Daily Show, Colbert is a self-described Democrat.[12][13]
This bears the mark of authority, because of the footnote superscripts that are already on it. (We can skip the step where we maliciously relocate them here.)
A novice might change it to this (correctly preserving the authoritative footnote superscripts):
Although by his own account he was not particularly political before joining the cast of The Daily Show, Colbert is a self-described Republican.[12][13]
It makes the person appear to be wrong, and the vandalism is obvious, like swapping Eurasia for Eastasia. There's no way he could have misread that.
But change it to this:
Although by his own account he was not particularly political before joining the cast of The Daily Show, Colbert has even been described as a Democrat.[12][13]
and the person looks not only wrong, but plausibly wrong because it looks like he can't read. That's what makes successful Wikipedia vandalism an art.
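To see why the subtle version is nastier for an automatic detector, it helps to look at the edits as word-level diffs. This is only a rough sketch using Python's standard difflib; the changed_words helper is a made-up name, and the sentences are shortened from the example above.

```python
import difflib
from typing import List, Tuple


def changed_words(old: str, new: str) -> List[Tuple[str, str]]:
    """Return (removed, added) word runs for every non-matching region."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


ORIGINAL = "Colbert is a self-described Democrat.[12][13]"
NOVICE = "Colbert is a self-described Republican.[12][13]"
SUBTLE = "Colbert has even been described as a Democrat.[12][13]"

if __name__ == "__main__":
    print("novice:", changed_words(ORIGINAL, NOVICE))
    print("subtle:", changed_words(ORIGINAL, SUBTLE))
```

The novice edit shows up as a single swapped token carrying the key claim, which a human or a filter can check against the cited sources. The subtle edit leaves the word "Democrat" and the footnotes untouched and only changes the filler words around them, so the diff looks like routine copyediting.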
A system like this has been implemented for the German Wikipedia. Almost everybody who has an account can verify articles to be vandalism-free; unless you are logged in, you see the last verified version by default.
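The display rule described here can be sketched in a few lines. This is not the actual MediaWiki code; the Revision class, the revision_to_show function, and the fallback for pages with no verified revision are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Revision:
    rev_id: int
    text: str
    verified: bool  # marked vandalism-free by a user allowed to verify


def revision_to_show(history: Sequence[Revision], logged_in: bool) -> Optional[Revision]:
    """Pick the revision a reader sees; history is ordered oldest to newest."""
    if not history:
        return None
    if logged_in:
        return history[-1]  # logged-in users see the newest revision
    for rev in reversed(history):
        if rev.verified:
            return rev  # anonymous readers see the last verified one
    # Assumption: if nothing has been verified yet, fall back to the newest.
    return history[-1]


if __name__ == "__main__":
    history = [
        Revision(1, "Good article text.", True),
        Revision(2, "Good article text, expanded.", True),
        Revision(3, "VANDALISED", False),
    ]
    print(revision_to_show(history, logged_in=False).rev_id)
    print(revision_to_show(history, logged_in=True).rev_id)
```

The appeal of this design is that unverified vandalism never reaches anonymous readers; the cost is that good edits also wait until someone verifies them.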
(+1, Disagree)
Whoever posted this clearly isn't aware of the actual work being done in the field. For instance, I was running a ___[thing]___ in _[year]_, and it wasn't new at the time. They've gotten much more sophisticated since then. Why are they so intent on reinventing the wheel? Do they not even realize that the wheel exists already? Why not just improve on it instead?
* * *
This looks like a useful template for the standard "why reinvent the wheel" Slashdot post; I hope you don't mind if I reuse it.
Officially, vandalism is defined as edits made in bad faith. If you are trying to improve the article but are an idiot (which includes people that don't realise their own bias), that isn't vandalism, it's just idiocy. It is only if you are editing with the intention of making the article worse that you are vandalising.
I edit wikipedia occasionally, and one thing I remove is unmotivated links to companies, or unnecessary mentioning of specific products. So yes, I consider it a case of vandalism. Since my edits are usually (always?) kept, I think most people agree. There is probably some policy about it, but I act on common sense there.
c++;
If I had mod points, I'd mod the parent up and the grandparent down. Seriously, almost everything in Wikipedia is transparent. Search the revision history and logs and look for the information you need. RTFM.
A lot of people on /. seem to derive very general opinions about admins from one disappointing personal encounter. They do not include diffs of their edits or their username. In my experience, in most cases the guy who got reverted by an admin broke some kind of rule (and often enough they just got reverted by a regular non-admin, but they assume it was an admin). Instead of RTFM, those people post as AC, complaining generally about admins without providing any traceable cases of admin abuse. I know my opinion isn't very popular, but unless you give concrete examples, your allegations are just FUD.
Case in point --- There is an article in Wikipedia about a certain country.
In that article, they blame their previous British colonial master for everything.
I tried to make some corrections to that article to make it more "neutral", and they changed it back within 10 minutes.
I tried again, and again they changed it back.
The third time, I was warned by someone from Wikipedia (dunno if it's a volunteer or something) that I have no right to make any corrections to that particular article anymore.
The "THEY" in question is the government of that country. They have a "cyber-patrol" group in charge of "online propaganda" and that Wikipedia article is one of their many lies, aka propaganda, they have put online.
Now, how do you define vandalism in this case?
Muchas Gracias, Señor Edward Snowden !