Slashdot Mirror


Vandalism Detection Contest Sponsored For Wikidata (wsdm-cup-2017.org)

Remember when Bing Maps lost a city because they used bad Wikipedia data? An anonymous Slashdot reader writes: Since knowledge bases like Wikidata are poised to be integrated into all kinds of information systems, wrong facts are not just displayed on Wikidata's pages but may propagate directly to all systems using the knowledge base. Hence, detecting and reverting vandalism and other kinds of damaging edits is an even more important task than on Wikipedia. Recently, German scientists published the first machine learning-based approach to vandalism detection in Wikidata, and now Adobe sponsors a competition on vandalism detection, the WSDM Cup Challenge, awarding $2500 for the best-performing solutions that will also be published open source.
"Given a Wikidata revision, compute a vandalism score denoting the likelihood of this revision being vandalism (or similarly damaging)," read the official rules, pushing for a near real-time solution to be submitted before December 22. And the winners will also be invited to the headquarters of Wikimedia Germany to discuss implementing their solutions.

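To make the task statement concrete, here is a minimal sketch of what "compute a vandalism score" for a revision could look like. It is not the contest's reference method and not the published machine-learning approach: the `Revision` fields, the three features, and the weights are all hypothetical and hand-picked for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Revision:
    # Hypothetical fields; the real contest data has its own schema.
    is_anonymous: bool
    comment: str
    new_value: str


# Hand-picked illustrative weights, not learned from any data.
WEIGHTS = {"bias": -2.0, "anonymous": 1.5, "empty_comment": 0.5, "shouting": 1.0}


def vandalism_score(rev: Revision) -> float:
    """Return a score in (0, 1); higher means more likely vandalism."""
    z = WEIGHTS["bias"]
    if rev.is_anonymous:
        z += WEIGHTS["anonymous"]
    if not rev.comment.strip():
        z += WEIGHTS["empty_comment"]
    # Treat a value written entirely in capital letters as suspicious.
    letters = [c for c in rev.new_value if c.isalpha()]
    if len(letters) >= 4 and all(c.isupper() for c in letters):
        z += WEIGHTS["shouting"]
    # Squash the weighted sum into a probability-like score.
    return 1.0 / (1.0 + math.exp(-z))
```

A real entry would presumably learn its weights from labeled revisions and use far richer features; the sketch only shows the shape of the interface, one revision in and one likelihood-style score out.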
38 comments

  1. Vandalism really? by Mashiki · · Score: 2

    Wikipedia has a bigger NPOV problem with their articles these days than vandalism. Especially because of people camping, or the variety of meat puppets that banned editors use to push agendas.

    --
    Om, nomnomnom...
    1. Re:Vandalism really? by ArmoredDragon · · Score: 1

      Wikipedia can be good for looking up things about natural sciences like biology, chemistry, physics, astronomy, etc., but for anything else it's often missing information, sometimes deliberately, and in cases where a page manages to not get deleted, it's heavily biased and one-sided. Case in point:

      https://en.wikipedia.org/wiki/...

      In fact it's almost a miracle that the page even exists as it has met wikipedia's notability standards for years, yet it is often deleted and blocked from being reposted. And then in the talk page an admin complains that it doesn't have enough content (gee, I wonder how that happened?) hinting that it could get deleted.

      Disclaimer: I'm not part of the MGTOW movement.

    2. Re:Vandalism really? by ShanghaiBill · · Score: 1

      Case in point:

      https://en.wikipedia.org/wiki/...

      Sorry, but I don't see anything wrong with that page. It certainly describes a real phenomenon, that is going to get much worse as technology improves, and may even eventually be a threat to humanity's existence. Many guys would be willing to replace their GF with a sexbot. Have you ever seen the TV show "Humans"? Watch a few episodes, and then ask yourself: If you had to choose, would you rather live with Laura or Anita?

    3. Re: Vandalism really? by Anonymous Coward · · Score: 0

      I'm afraid I don't understand the question.

    4. Re: Vandalism really? by Anonymous Coward · · Score: 0

      I'm afraid I don't understand the question.

      The question presumes you have seen the show. Laura and Joe are a married couple, with three kids, and a humanoid household robot named Anita. Laura is often away on business, and when she is home she spends most of her time complaining about various things and being depressed. Anita has the form of an attractive young woman. She cooks gourmet meals, keeps the house spotless, engages in pleasant (but shallow) conversation, and can (optionally) provide "adult services".

      The show is free on Netflix and Amazon Prime.

    5. Re:Vandalism really? by NotInHere · · Score: 1

      We'll be killed by flying saucers firing lasers if we don't ban sexbots: https://vimeo.com/12915013

    6. Re: Vandalism really? by FatdogHaiku · · Score: 1

      ...and can (optionally) provide "adult services".

      I'm thinking that's a "browsing history" that needs clearing before Laura gets home...
      Also, are some parts of Anita dishwasher safe?

      --
      You have the right to remain sentient. If you give up the right to remain sentient, you will be elected to public office
    7. Re: Vandalism really? by ShanghaiBill · · Score: 1

      I'm thinking that's a "browsing history" that needs clearing before Laura gets home...

      Indeed. A big part of the plot is when Joe fails to clear that history with a detrimental effect on his marriage. He tries to explain that a man using a sexbot is no different in principle than a woman using a vibrator, but that analogy was not convincing.

      Also, are some parts of Anita dishwasher safe?

      Better than that. After the "action", she simply excuses herself and goes to the bathroom to clean herself up.

    8. Re:Vandalism really? by FunkSoulBrother · · Score: 1

      According to the Wikipedia deletion log, that page was deleted once (in 2007), and the similar page MGTOW was deleted a few times after a contentious debate in 2006/2007. In the end, it is 2016 and the article is there. It seems like the process won?

      I'm not a part of this movement either, but from the little I've gleaned from it on the internet over the years, it was a pretty new thing in 2006 (maybe not the concept, but the 'MGTOW' group/logo/etc.). Maybe the sources back then really did suck, and it just needed a bit of time to boil into the cultural zeitgeist?

    9. Re: Vandalism really? by FatdogHaiku · · Score: 1

      Dammit, now I got something else to binge watch!
      Thx. :)

      --
      You have the right to remain sentient. If you give up the right to remain sentient, you will be elected to public office
    10. Re:Vandalism really? by Anonymous Coward · · Score: 0

      Wikipedia has a real problem with low editorial quality but that is probably the weirdest example of Wikipedia inadequacy that could be pointed out. If anything that article points out a strength of Wikipedia, specifically Wikipedia's breadth. Good luck finding another encyclopedia that is anywhere near as informative on whatever that article is about. Granted I still don't know what the heck the article is describing as I have an aversion to childish slap fights over what is seen as politically correct so I am still wholly disinterested in the material. If I wished to take time contemplating the subject then GOING TO THE SOURCES in the article would likely produce enough material to begin to comprehend the subject. And that is pretty much what Wikipedia is good for.

    11. Re:Vandalism really? by ShanghaiBill · · Score: 1

      According to the Wikipedia deletion log, that page was deleted once (in 2007), and the similar page MGTOW was deleted a few times after a contentious debate in 2006/2007.

      This aggressive deletionism is the reason I stopped contributing to, and no longer donate to Wikipedia. If you are not interested in a particular topic, then DON'T READ ABOUT IT. But there is no reason to delete it just to spite the people that ARE interested. A paper encyclopedia has to be selective because paper has a significant cost, and shelf space is limited. But an online encyclopedia does not have those constraints. Even if Wikipedia was ten, or even a hundred times bigger, the cost of the disk space would be negligible.

    12. Re:Vandalism really? by Anonymous Coward · · Score: 0

      This really just means that Wikipedia should fork. Fork into deletionist and inclusionist versions. Not sure why this is even an issue. The two camps could then trade content back and forth as desired for their own product. Naturally, the inclusionist camp would be bigger. Personally, I'm more interested in the inclusionist version.

      IMHO, there are other reasons WP should fork as well. With appropriate infrastructure, I don't see why everyone can't have their own version of Wikipedia, a la git or even Project Xanadu. It will splinter anyway once the tech becomes simple enough to manage on an individual basis.

  2. Where's the by ChoGGi · · Score: 1

    mod abuse contest?

  3. Easy! by Anonymous Coward · · Score: 0

    flag all edits as vandalism and have them reviewed by a human

  4. Authoritarianism does not valid data by shawnhcorey · · Score: 1

    The most reliable database ever created is our scientific knowledge. And it only got that way because of some basic rules: 1. Anyone can do science. 2. Anyone can debate science. 3. All evidence must be repeatable and repeatedly verified. Any database that does not follow scientific methodology will always be susceptible to containing bad data.

    --
    Don't stop where the ink does.
    1. Re:Authoritarianism does not valid data by Sique · · Score: 2

      Any database will always be susceptible to containing bad data, even those that follow the scientific methodology. Any data is only preliminary, and will be kept only until better data comes in. What you totally ignore is how to determine which of two conflicting data points is closer to reality. Wikipedia doesn't do research. That's one very important concept of Wikipedia: no original research. If the people doing the original research are losing interest in Wikipedia, or are run over by a bus, Wikipedia loses any reference point for the data they entered. So what you get is stale data and no way to find out if it is both valid and relevant. If you ban original research from Wikipedia, you at least can vouch for the relevance of the data by checking if research is still going on outside of Wikipedia, and if you can find someone to keep the Wikipedia data up-to-date.

      So your blurb about the scientific method is irrelevant for Wikipedia, as Wikipedia is just a mirror of what happens outside of it. You need other criteria to determine which data in Wikipedia to keep and which data to throw out. Checking for possible vandalism is just one of the methods to throw out irrelevant data and to keep relevant data that got overwritten by vandalism.

      --
      .sig: Sique *sigh*
    2. Re:Authoritarianism does not valid data by shawnhcorey · · Score: 1

      You have just stated the biggest issue with Wikipedia: it is not self-correcting. If Wikipedia was started 2000 years ago, it would still state that the Earth was the centre of the universe because all experts agreed with it. New ideas like a solar system would be labelled vandalism.

      --
      Don't stop where the ink does.
    3. Re:Authoritarianism does not valid data by Anonymous Coward · · Score: 1

      Who needs original research when you can just park on an article and decide what is worthy of being cited or not? Just include whatever citations support your point of view, delete citations you find disagreeable, and throw acronyms and reverts at other editors until they give up.

    4. Re:Authoritarianism does not valid data by K.+S.+Kyosuke · · Score: 1

      It's not vandalism to publish new research outside of Wikipedia.

      --
      Ezekiel 23:20
    5. Re:Authoritarianism does not valid data by Sique · · Score: 1

      Not necessarily. You could have a second article about a heliocentric system, and maybe a third one discussing the merits of a geocentric and a heliocentric system. Just keep the original article about the geocentric system intact! There is no reason to vandalize it just because you have a second possible description of the events in the sky. In fact, it took about 250 years between Copernicus's De revolutionibus and Isaac Newton's theory of gravity, which finally allowed the heliocentric system to catch up in prognostic accuracy to the geocentric one! For those 250 years, simply writing "it's wrong!" into the article about the geocentric system indeed would have been vandalism. Until about 1700, the geocentric system, albeit horribly complicated with its cycles and epicycles and without any real explanation why they are needed, was simply the better prognostic tool for astronomy.

      --
      .sig: Sique *sigh*
    6. Re:Authoritarianism does not valid data by Mashiki · · Score: 2

      Not necessarily. You could have a second article about a heliocentric system, and maybe a third one discussing the merits of a geocentric and a heliocentric system. Just keep the original article about the geocentric system intact!

      That's a fair point, however under today's rules at Wikipedia, along with the cock-gobbling editors, your topic on heliocentric systems would likely be flagged for deletion because it's non-notable (akin to denialism), or doesn't conform to the ruling form of orthodoxy. The sources, regardless of whether or not they're factual, would suddenly be marked as unreliable, even if they had provable baseline statistical models with the peer-reviewed data to back it up.

      Wikipedia simply needs to be purged of all editors and the foundation at this point with a full start over. It doesn't help that ol' Jimbo prefers to wash his hands of everything while saying "I need monies..."

      --
      Om, nomnomnom...
    7. Re:Authoritarianism does not valid data by Anonymous Coward · · Score: 0

      Heliocentric model is not notable and is promoted by fringe non-scientists, see WP:UNDUE. Deleting.

  5. If Adobe is involved. . . by smooth+wombat · · Score: 0

    You can be sure whoever wins will have the most convoluted code possible which works but only on the third day of the week when the moon is full.

    Adobe can't even get their own code to work activating its own products on their own servers. What makes people think they're qualified to know if this code is good or not?

    --
    We will bankrupt ourselves in the vain search for absolute security. -- Dwight D. Eisenhower
  6. Facebook is worst by Anonymous Coward · · Score: 0

    Facebook has lots of cities, towns, regions and addresses wrong. And worst of all: no way to correct it. Facebook ignores all comments and there is no way to contact anyone at Facebook.

  7. This is an example of 'true' democracy failing by Anonymous Coward · · Score: 0

    You need representative governing, you don't want to give the power of editing or execution to just anybody anywhere. Libertarianism doesn't work.

  8. So they want to improve Narrative Enforcement? by sethstorm · · Score: 0

    All Wikipedia wants to do is get help in enforcing their left-wing narrative - by considering truth as counter-narrative material that must be purged.

    --
    Twitter supports and protects racists - by smearing their critics with the "Hate Speech" label.
  9. Re:What is vandalism by sittingnut · · Score: 1, Troll

    anybody who relies on wikipedia for anything important must be an ignorant idiot.
    writing that truth in wikipedia article on wikipedia, would be counted as vandalism in wikipedia.

  10. Re: What is vandalism by Anonymous Coward · · Score: 0

    Articles blaming humans for "climate change" = vandalism.

  11. Racist Summary by chuckugly · · Score: 1

    My people, the Vandals and their various descendants, had nothing to do with this as a race and feel the summary and implication is beneath the fine people who run Wikipedia. Not even a trigger warning!

  12. Jesus, Don't They Teach This in Business School? by RobotRunAmok · · Score: 1

    Because I know they teach it in high school: Do not rely on Wikipedia for anything more than starting you in more or less the right direction. To build a business or even a business plan around the accuracy of the content in Wikipedia is ludicrous.

    Help Wikipedia "revert vandalism"? WTF?? Wikipedia is like that Mom&Pop Bookstore where they scream at you not to let the cat out when you open the door... "Dude, I wandered in to browse some books, if I can't do that without you stressing me I'll go shop someplace else!"

  13. Re:What is vandalism by Anonymous Coward · · Score: 0

    anybody who relies on wikipedia for anything important must be an ignorant idiot.
    writing that truth in wikipedia article on wikipedia, would be counted as vandalism in wikipedia.

    Anybody who relies on Slashdot for anything must be an ignorant idiot. TFTFY

    Which makes you an ignorant idiot, with an ironically accurate pseudonym and a long history of posting stupid (do you work for Microsoft? advise Trump on policy?)

    For the record - some fool employed by that company of fools called Microsoft, copied that latitude and longitude from Wikipedia but failed to copy the "-" for the latitude. The wikipedia information was correct.

    This story is a big lie. Classic Microsoft bullshit.

  14. Prize by Anonymous Coward · · Score: 0

    $1500 is the biggest prize Adobe could come up with? I get paid for this shit and that's a lot of work.

  15. Sometimes the "bots" are actually the vandals by Anonymous Coward · · Score: 1

    I've run into this a few times. Make an edit, and some bot comes by and vandalizes it.

  16. Re:What is vandalism by sittingnut · · Score: 1

    "Anybody who relies on Slashdot for anything must be an ignorant idiot. TFTFY

    Which makes you an ignorant idiot, with an ironically accurate pseudonym and a long history of posting stupid. do you work for Microsoft? .... Classic Microsoft bullshit."

    what are you talking about?
    my comment history will indicate that i do not like m$, if nothing else.
    (btw i have "excellent" karma here, and what do you mean i rely on slashdot? where? i have been very critical of editors here)

    clearly you have a problem verifying facts, and prefer to make silly accusations regardless of facts. no wonder you jumped to defend wikipedia.

  17. So they want to improve Narrative Enforcement. by sethstorm · · Score: 1

    All Wikipedia wants to do is get help in enforcing their left-wing narrative - by considering truth as counter-narrative material that must be purged.

    Never mind that the SOCJUS contingent of Slashdotters prove my point by modbombing anything that counters their narrative, especially truth.

    --
    Twitter supports and protects racists - by smearing their critics with the "Hate Speech" label.