34 'Highly Toxic Users' Wrote 9% of the Personal Attacks On Wikipedia (bleepingcomputer.com)
Researchers used machine learning to analyze every single comment left on Wikipedia in 2015. An anonymous reader shares their results:
34 "highly toxic users" were responsible for 9% of all the personal attacks in the comments on Wikipedia, according to a research team from Alphabet's Jigsaw and the Wikimedia Foundation. They concluded that "significant progress could be made by moderating a relatively small number of frequent attackers." But at the same time, in Wikipedia's comments "less than half of attacks come from users with little prior participation; and perhaps surprisingly, approximately 30% of attacks come from registered users with over a 100 contributions. These results suggest the problems associated with personal attacks do not have an easy solution... the majority of personal attacks on Wikipedia are not the result of a few malicious users, nor primarily the consequence of allowing anonymous contributions."
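The "34 users wrote 9% of the attacks" figure is a concentration statistic: count attacks per author, then ask what share the most frequent attackers hold. A minimal sketch of that arithmetic follows, assuming you already have one author ID per comment flagged as an attack. The function name and the toy data are hypothetical, not taken from the paper.

```python
from collections import Counter

def top_k_attack_share(attack_authors, k):
    """Return the fraction of all attacks written by the k most frequent attackers."""
    counts = Counter(attack_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no attacks at all, so no share to report
    top = sum(n for _, n in counts.most_common(k))
    return top / total

# Hypothetical toy data: one author ID per comment flagged as an attack.
attacks = ["u1"] * 5 + ["u2"] * 3 + [f"u{i}" for i in range(3, 13)]
print(f"top-1 share: {top_k_attack_share(attacks, 1):.2f}")
```

A low share for a small k alongside a long tail of one-off attackers is the pattern the summary describes: moderating the heaviest attackers helps, but most attacks still come from everyone else.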
The researchers "developed a machine learning algorithm that was able to identify and distinguish different forms of online abuse and personal attacks," reports Bleeping Computer, adding that the team "hopes that Wikipedia uses their study to build a comments monitoring dashboard that could track down hotspots of abusive personal attacks and help moderators ban or block toxic users." The paper describes it as a method "that combines crowdsourcing and machine learning to analyze personal attacks at scale."
were in regard to overly territorial Wikipedia moderators?
Build a Man a Fire, and He'll Be Warm for a Day. Set a Man on Fire, and He'll Be Warm for the Rest of His Life.
I'm not really sure this is a problem for Wikipedia, but the ABC guys seem to think so. But take a look at their methodology: "crowdsourced" "machine learning" via a proprietary website, after they removed "common comments," which they assume to be bots. I'm sure anyone using the same data set would be hard-pressed to recreate their results. They are very fuzzy despite all the algorithmic pruning.
We use this data to train a machine learning classifier, experimenting with features and labeling methods
Isn't this what they're really testing? An unspecified machine learning with "features" and "labeling"? Absolutely bewildering.
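For anyone wondering what "features" and "labeling" usually mean in this kind of pipeline, here is a rough sketch. It is an assumption about the general approach, not the researchers' actual code: several crowd workers vote on each comment, the votes are collapsed into one label, and the text is turned into character n-gram counts that a classifier can train on. Both function names are made up for illustration.

```python
from collections import Counter

def majority_label(votes):
    """Collapse crowdsourced 0/1 votes into one label; ties count as not-attack."""
    return 1 if sum(votes) * 2 > len(votes) else 0

def char_ngrams(text, n=3):
    """Count overlapping character n-grams of a lowercased comment."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# Hypothetical example: three annotators rate one comment (1 = personal attack).
label = majority_label([1, 1, 0])
features = char_ngrams("You are an idiot", n=3)
print(label, features.most_common(3))
```

"Experimenting with labeling methods" could then mean swapping the majority vote for something else, such as keeping the fraction of annotators who voted "attack" as a soft label.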
Another common category I've found is the crazy old guy. They'll be in their mid-60s or so and loaded up with conspiracy theories. Most are harmless and just bad at understanding what a reliable source is, but a handful edit like a whirlwind and bite the head off of anyone who disputes or reverts their edits.