Slashdot Mirror


Developing a Vandalism Detector For Wikipedia

marpot writes "In an effort to assist Wikipedia's editors in their struggle to keep articles clean, we are conducting a public lab on vandalism detection. The goal is the development of a practical vandalism detector that is capable of telling apart ill-intentioned edits from well-intentioned edits. Such a tool, which will work somewhat like a spam detector, will release the crowd's workforce currently occupied with manual and semi-automatic edit filtering. The performance of submitted detectors will be evaluated based on a large collection of human-annotated edits, which has been crowdsourced using Amazon's Mechanical Turk. Everyone is welcome to participate."

116 comments

  1. Existing by ShakaUVM · · Score: 3, Insightful

    Apparently, how their vandalism detector works right now is by automatically reverting any edits done by anonymous editors.

    (And yeah, that's a bit sarcastic, but somewhat true.)

    1. Re:Existing by Rik+Sweeney · · Score: 3, Interesting

      It's called Clue Bot. It's been known to revert vandalism in under 30 seconds :)

    2. Re:Existing by Anonymous Coward · · Score: 0

      [[citation needed]]

    3. Re:Existing by broken_chaos · · Score: 4, Insightful

      I'm assuming it's also known to revert good edits in under 30 seconds?

      Just thinking out loud here, but is raw speed of reversion really what should be bragged about, as opposed to accuracy?

    4. Re:Existing by Rik+Sweeney · · Score: 4, Informative
    5. Re:Existing by Ignorant+Aardvark · · Score: 2, Informative

      I'm not sure why he bragged about reversion speed. All that's really dependent on is your network connection. For one, your network connection has to be good enough to download, in real time, the diffs of all edits to Wikipedia. Most aren't.

      Anyway, a decision as to whether a given diff is vandalism or not needs to be made in a small fraction of a second, as there are dozens of edits coming in every second, and if you continuously fall farther and farther behind, you lose. Given an ideal network connection, vandalism should be reverted in a couple of seconds or so.

      I suppose there's some argument to be made for a large cluster of computers handling all edits on Wikipedia, each one spending up to a full second judging each individual edit, but the truth is that none of the algorithms currently in use for vandalism detection are nearly sophisticated enough to require so much computation time.
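The capacity argument above can be put as rough arithmetic. This is a minimal sketch, not a measurement: the rate of 30 edits per second is an assumed stand-in for "dozens", and `workers_needed` is a hypothetical helper name.

```python
import math


def workers_needed(edits_per_second: float, seconds_per_edit: float) -> int:
    """Smallest number of parallel workers that keeps the backlog from growing."""
    # If edits arrive faster than they are judged, the queue grows without bound,
    # so total judging capacity must be at least the arrival rate.
    return max(1, math.ceil(edits_per_second * seconds_per_edit))


# Assumed figures for illustration only.
print(workers_needed(30, 1.0))   # a full second of computation per edit
print(workers_needed(30, 0.01))  # a cheap heuristic per edit
```

Under these assumptions, a cluster only becomes necessary once the per-edit cost approaches a second. A cheap heuristic fits on one machine, which leaves the network feed as the bottleneck.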

    6. Re:Existing by Anonymous Coward · · Score: 0

      mostly edits deemed to be fursecution

    7. Re:Existing by broken_chaos · · Score: 3, Informative

      Oh yes, it definitely hits a large number of false positives, presumably also 'fixed' within 30 seconds. For every one that goes reported (including the hundreds or thousands of archived reports), there must be many that go unreported, by 'non-Wikipedians' who edited a page with an error, and then went on their way. Or by people who didn't stick around to 'watch' that their edit doesn't get 'fixed' by an automated process...

    8. Re:Existing by Ignorant+Aardvark · · Score: 3, Informative

      The false positive rate on the anti-vandalism bots is a lot lower than you would think. The bots are written quite conservatively, take a lot of factors into account, and only pull the revert trigger when they are quite sure.

      It's the type II error rate that's pretty high. Unfortunately, that's not solvable without strong AI.

    9. Re:Existing by DamonHD · · Score: 3, Interesting

      Amazingly my small sample is to the contrary.

      I fix small errors of syntax/grammar/fact when I run across them, have never created an account, and almost all of my edits seem to stick.

      Rgds

      Damon

      --
      http://m.earth.org.uk/
    10. Re:Existing by marpot · · Score: 5, Informative

      We have studied the accuracy of ClueBot, and found that (on a small corpus) it has very good precision (low false positive rate), but a very low recall (low true positive rate). (See: http://www.uni-weimar.de/medien/webis/publications/downloads/papers/stein_2008c.pdf) But the picture might look quite different on a large scale.
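For readers unfamiliar with the terms, here is a minimal sketch of how precision, recall, and false positive rate are computed from a detector's hits and misses. The counts are made up for illustration and are not figures from the linked paper. The `Confusion` class is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # vandalism edits the detector flagged
    fp: int  # good edits the detector flagged by mistake
    fn: int  # vandalism edits the detector missed
    tn: int  # good edits the detector left alone

    def precision(self) -> float:
        """Of everything flagged, the share that really was vandalism."""
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self) -> float:
        """Of all vandalism, the share that was flagged (true positive rate)."""
        vandalism = self.tp + self.fn
        return self.tp / vandalism if vandalism else 0.0

    def false_positive_rate(self) -> float:
        """Of all good edits, the share that was wrongly flagged."""
        good = self.fp + self.tn
        return self.fp / good if good else 0.0


# Invented counts: a conservative bot that rarely errs but misses most vandalism.
example = Confusion(tp=20, fp=1, fn=80, tn=899)
print(example.precision(), example.recall(), example.false_positive_rate())
```

With these invented counts the detector is right almost every time it acts, yet catches only a fifth of the vandalism. That is the high-precision, low-recall pattern described above.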

    11. Re:Existing by marpot · · Score: 2, Interesting

      This is by far overestimated. Depending on how elaborate your edit model is, you can analyse edits live on a laptop.

    12. Re:Existing by Yvan256 · · Score: 2, Funny

      A Clue Bot, eh? I wonder what happens if you register the username "Colonel Mustard".

    13. Re:Existing by Phil06 · · Score: 0

      It should be easy to point out vandalism. Instead it is difficult and hackneyed. As long as this is the case, I am not going to go through the trouble when I find it.

      --
      "...and yet, I blame society" Duke - Repo Man
    14. Re:Existing by Anonymous Coward · · Score: 0

      It's true. I fixed a spelling error once - a real one, not one that depends on US/UK spelling differences - and it was reverted almost immediately. That made me not want to contribute to it any more.

    15. Re:Existing by beakerMeep · · Score: 5, Insightful

      The problem is not so simple though. You can't quantify something as subjective as vandalism. You can't reduce it to your mathematical formula no matter how statistically fancy your 6-page PDF is.

      I had a particularly nasty run-in with ClueBot where I removed large portions of spam from an article, only to have ClueBot revert it back and put the spam back in. When I again removed the spam, some other editor strolled by and again put the spam back in because he trusted the bot more than humans and he didn't read the talk page where many had requested the removal of this spam. Finally, after a rather rude conversation with the human he realized he had no business reverting it. This person was a long-time editor and contributor too, but it just serves as an example that any criteria used to determine spam is based upon assumptions. Assumptions that it will be true in other cases and assumptions that others will agree with the classification.

      The whole point of Wikipedia is that it is a community edited encyclopedia. I have no interest in a computer edited encyclopedia. If people want to program bots to review an editor's work, perhaps we should program bots to write the work? Perhaps you can call it Botopedia. Furthermore, many of the bots ask you to report false positive to their personal pages off of Wikipedia's website on some other .com or .edu domain. They ask you to be accountable to them, but who are they accountable to? What's to stop spammers from programming bots to annoy editors as a phishing exercise?

      Now don't get me wrong though, if someone wants to use a bot to aid in finding vandalism, that would help. But if the system is so frail that Wikipedia can't exist without computer program editors, it may be time to revisit the system. As others have stated, pushing edits into a queue would be much more sane than direct to live edits.

      Editing bots are wrong for Wikipedia, and if they allow it they are letting go of their vision of community participation in favor of the visions (or delusions) of grand technological solutions.

      --
      meep
    16. Re:Existing by Ignorant+Aardvark · · Score: 1, Offtopic

      Which part is over-estimated? All I can speak on from experience is AntiVandalBot. I ran that on an Athlon XP 2500+ (which wasn't particularly amazing at the time). It wasn't the computation that was hard, it was the network usage of downloading the diff of every edit by a non-trusted user from the RC feed. I would not have been able to run it on any home Internet connection. Thankfully I was able to place my server on an unthrottled 100 Mbps dorm connection at the University of Maryland.

      I will grant you that high-speed Internet access has become a lot more widespread since 2006 (I personally have 25/15 FIOS), but at the time, there wasn't anything available residentially that could handle it.

    17. Re:Existing by v1 · · Score: 1

      A Clue Bot, eh?

      Every time I see that in this thread, my eyes substitute "Clue Bat" and it totally changes meaning of the post while still making some degree of sense, making it hard to filter out.

      The base problem here is going overlooked. There isn't one kind of edit they're trying to combat, there's several. And each requires a different approach because they are incompatible.

      1- Spam (Monty Python kind) : OK, that's easy for the bots to get rid of. Even very loose definitions are easy to code with a good catch/FP rate.

      2- Armchair Quarterback : good luck here. It's very easy for someone to change a fact by altering a word, a name, or a date, and probably the most reliable method to identify these is going by a reputation model, either a bit like Slashdot does, or by comparing the number of edits that acct has made that have not been reverted. These are edits that introduce mistakes by virtue of a sincere person trying to make a "good edit" but polluting the wiki as a result of their poor memory or misunderstanding of the subject.

      3- Malicious : these are deliberate variations of (2) above, and are no easier for a bot to spot. The only difference a bot has in comparing them is that these are much more likely to occur on high profile targets (political or charged issue pages, the ones wiki protects from editing by random ppl as it is) whereas (2) will tend to occur on topics with a more obscure nature or on pages where you are fleshing out very picky details like a football player's stats back in the 90's. Syntactic scanning for specific grammatical changes (like changing the assertion by adding a "not") would help here because those are the easiest malicious edits to make that would survive a cursory glance.

      Bots only really have a chance dealing with (1) effectively. The best they can do for (2) and (3) is to simply have a high FP rate and cause real humans to have to take a look at them. IMHO a bot that identifies a probable 2/3 should NOT revert it, but instead should flag it for specific attention by a real editor. This will help take some of the load off the editors that normally task themselves with looking at any edits, rather than just the suspicious ones.
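The flag-instead-of-revert idea for types (2) and (3) can be sketched as follows. Everything here is an assumption made for illustration: the negation word list, the 0.5 reputation cut-off, and the function names are hypothetical, and no existing bot is known to work this way.

```python
import re

# Hypothetical word list; a real detector would need something far richer.
NEGATIONS = {"not", "never", "no"}


def words(text: str) -> list[str]:
    """Lower-cased word tokens, ignoring digits and punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def added_negations(old: str, new: str) -> set[str]:
    """Negation words that occur more often in the new text than in the old."""
    old_words, new_words = words(old), words(new)
    return {w for w in NEGATIONS if new_words.count(w) > old_words.count(w)}


def reputation(kept_edits: int, reverted_edits: int) -> float:
    """Share of an account's past edits that were not reverted (0.5 if no history)."""
    total = kept_edits + reverted_edits
    return kept_edits / total if total else 0.5


def triage(old: str, new: str, kept_edits: int, reverted_edits: int) -> str:
    """Return 'flag' (send to a human) or 'pass'; this sketch never reverts."""
    suspicious = bool(added_negations(old, new))
    low_reputation = reputation(kept_edits, reverted_edits) < 0.5
    return "flag" if suspicious or low_reputation else "pass"


print(triage("He was born in 1950.", "He was not born in 1950.", 10, 0))
```

Note what the sketch cannot do: a changed date from an account with a clean history passes straight through, which is exactly the armchair-quarterback case described above.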

      --
      I work for the Department of Redundancy Department.
    18. Re:Existing by marpot · · Score: 1

      I cannot agree more with what you say, but I'd like to give it a twist: I want computers to assist me, and I want them to to it good, reliable, and robust. If I happen to be a Wikipedia editor that doesn't change a thing, I still want the computer to assist me with what I'm doing. Now, currently there is no such thing, and the only thing I'd like to foster research in doing so.

      Now, some always go ten steps further, when someone talks about a new "solution" based on computers. They directly envision a world where computers take over. And that, apart from being unrealistic today, must be considered ideological, instead of logical.

      After all, all you see here and all you see on Wikipedia is made possible only by machines working with intelligent algorithms.

    19. Re:Existing by marpot · · Score: 1

      Me too, experience that is. We took the features from our research with high throughput, and implemented a live edit analysis for the English portion of Wikipedia. It listens on the IRC channel, downloads the wikitexts of the old and new revisions of each edit, and then does its magic. And it did so once on an old laptop. The computer was connected at max 1 GBit/s.
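The "old and new revision" step can be sketched with Python's standard `difflib`. This is only an illustration of comparing two wikitexts. It does not reproduce the actual features or the IRC listener from the research, and `upper_case_ratio` is a hypothetical example feature.

```python
import difflib


def inserted_text(old_wikitext: str, new_wikitext: str) -> list[str]:
    """Return the word runs that the new revision adds or substitutes in."""
    old_tokens = old_wikitext.split()
    new_tokens = new_wikitext.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # 'insert' and 'replace' are the opcodes that bring in new tokens.
        if tag in ("insert", "replace"):
            added.append(" ".join(new_tokens[j1:j2]))
    return added


def upper_case_ratio(text: str) -> float:
    """Hypothetical feature: share of letters that are upper case."""
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0


added = inserted_text("The cat sat on the mat", "The cat sat on the red mat")
print(added, [upper_case_ratio(chunk) for chunk in added])
```

Working on whitespace-separated tokens keeps the comparison cheap, which fits the claim that this kind of analysis can run live on modest hardware.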

    20. Re:Existing by Tango42 · · Score: 2, Insightful

      In that paper, you say you think high-recall (ie. low false negatives) should be preferred to high-precision (low false positives) since it reduces the chance of a reader seeing a vandalised version. I disagree. You underestimate the harm caused by losing editors that get annoyed when their legitimate edits are reverted by a bot. The upcoming feature, Flagged Revisions ( http://en.wikipedia.org/wiki/Wikipedia:Flagged_revisions ), will provide a much better way of preventing readers from seeing vandalised versions while not costing us useful editors.

    21. Re:Existing by Big+Jojo · · Score: 1, Interesting

      Apparently, how their vandalism detector works right now is by automatically reverting any edits done by anonymous editors.

      I've seen signs of that too. Not always ... but often enough to have acquired a rather negative understanding of the role of some folk with admin privileges at WP. It's clear when they haven't even bothered to read (much less understand!) the edits they revert. Or that they just revert anything that offends an ideology they want WP to present on any particular topics. They think NPV shouldn't apply to their gloriously elevated selves. (And refuse to acknowledge when their ideology is showing.)

      That's on top of editors just flagging articles as sub-par but without saying specifically why, or responding to queries about WTF they meant. Not every article should consist of 50% citations and 50% content ... if you're going to say there aren't enough citations, just be specific about which statements you think need citations; that's easy to do. And maybe ... read the citations which are already there. Or even use the Talk: page appropriately, to discuss such issues, if you can't yet be specific enough to be actionable.

      The messages some admins give is that if you're not part of their particular club, Please Go Away. Some are even quite public that they object to edits from folk without accounts ... regardless of the content of those edits. Way too many obnoxious A**hats have admin privs there.

      How about letting us flag such editors/admins as comment spammers? It's not like their volume of vague and un-actionable criticisms, or inappropriate reversions, really helps improve WP. While unlike real spammers, their negative effects are actually hard to correct.

    22. Re:Existing by The+Wild+Norseman · · Score: 2, Funny

      I cannot agree more with what you say, but I'd like to give it a twist: I want computers to assist me, and I want them to DO it WELL, RELIABLY, and ROBUSTLY.

      -Slashbot Editor 0.95 beta

      --
      "A government is a body of people usually -- notably -- ungoverned." -Shepherd Book
    23. Re:Existing by Moryath · · Score: 0, Troll

      Finally, after a rather rude conversation with the human he realized he had no business reverting it.

      And if the editor had been one of wikipedia's "admins", he would have simply gone "ban. lock talkpage." And he'd have gone right on his merry way to abuse someone else.

      Now don't get me wrong though, if someone wants to use a bot to aid in finding vandalism, that would help. But if the system is so frail that Wikipedia can't exist without computer program editors, it may be time to revisit the system. As others have stated, pushing edits into a queue would be much more sane than direct to live edits.

      "Bots" are used for everything these days on wikipedia, and inevitably, they don't work right. The question of whether they "don't work right" so badly that even the one or two sane admins left call attention to it and lock them out, or whether they simply go on doing what they do (some were even programmed by the insane portions of the wikipedia admins to control certain pages), is predicated on the politics of whether a wikipedia admin supports it or not.

    24. Re:Existing by Eivind · · Score: 1

      I dunno. I find a bot okay, but it should be extremely conservative, because it's so bad if it reverts edits that are in fact made in good faith (even if the edits themselves are bad). It's possible Cluebot isn't conservative ENOUGH, but if you have a look at say the last 100 edits it's made, it's really hard to argue that they're not 99%+ bad-faith vandalism.

    25. Re:Existing by LQ · · Score: 1

      I'm an occasional "recent changes patroller" and I don't really care how many false positives cluebot gets in anonymous edits. It's too busy weeding out the thousands of "Bob is gay" and "I like pie" edits. Why they still allow anonymous "editors", I really don't know.

    26. Re:Existing by jonadab · · Score: 1

      Yeah, my edits generally stick too, and almost all of them are anon/IP, not because I haven't created an account, but because Wikipedia's session-timeout policy is so short that logging in seldom does any good. You can log in, but by the time you're ready to commit an edit, you're typically not logged in any more. I can't imagine what possessed them to make it so short. I've got better things to do than log in *again* each and every time I want to make an edit. So I usually don't bother. And it doesn't seem to matter.

      As for the type of edits I make, there's a lot of variety. I frequently fix punctuation, grammar, or syntax, but I've also done much more substantial editing when I thought it was warranted. I've even done a couple of wholesale rewrites, including one for an inherently somewhat controversial article (Abraham); it's seen a great deal of editing since, but the overall flow and outline of the article is much closer to what I instituted than to anything it had before. I've created stubs and redirects, removed unsourced content, added citations, added new sections, combined sections, added external links, removed obviously excessive requests for citation where they were clearly unnecessary, removed requests for cleanup when the cleanup seemed to have already been done, ... basically, whatever seems to be needed. I'm usually not logged in, but almost all of my edits stick. To my knowledge, I've never been reverted by a bot or administrator. In fact, the only instance I know about where I was reverted quickly was probably not deliberate. (The other editor in that case appeared to be using the whole-page edit button instead of section editing and apparently did not want to merge his own rather substantial edits with my spelling corrections, even though they were in an entirely different section.)

      --
      Cut that out, or I will ship you to Norilsk in a box.
  2. {{uw-vandalism1}} by Anonymous Coward · · Score: 5, Funny

    Welcome to Slashdot. Although everyone is welcome to contribute to Slashdot, at least one of your recent posts did not appear to be constructive and has been modded down. Please use TrollTalk for any test edits you would like to make, and read the welcome page to learn more about contributing constructively to this web site. Thank you.

  3. Been done? by Ignorant+Aardvark · · Score: 2, Informative

    Whoever posted this clearly isn't aware of the actual work being done in the field. For instance, I was running an anti-vandalism bot in 2006, and it wasn't new at the time. They've gotten much more sophisticated since then.

    Why are they so intent on reinventing the wheel? Do they not even realize that the wheel exists already? Why not just improve on it instead?

    1. Re:Been done? by Kratisto · · Score: 4, Funny

      The article on anti-vandalism bots had been recently vandalized when they were doing their preliminary research.

      --
      Conscience is the inner voice which warns us that someone may be looking.
    2. Re:Been done? by marpot · · Score: 2, Informative

      We are very aware of the existing tools (Huggle, Twinkle, and so on). See the links in the above post, and see the links in the resources section of the competition Web page. An accurate vandalism detector will take a lot of research and development, just like spam detectors did... Why did you stop developing your tool, anyway?

    3. Re:Been done? by pipatron · · Score: 1

      Why are they so intent on reinventing the wheel? Do they not even realize that the wheel exists already? Why not just improve on it instead?

      Sometimes it's more practical to start from scratch. You might want to change the design from the ground up, and to do that with an already working bot would not be as constructive. The current bots are probably very well tweaked and polished, for their given design and methods of spam detection.

      --
      c++; /* this makes c bigger but returns the old value */
    4. Re:Been done? by LifesABeach · · Score: 2, Insightful

      One of the quirks I've noticed is when a business makes, or invents something, then uses the Wiki to advertise. I can't help but wonder if this could also be considered a form of vandalism?

    5. Re:Been done? by Tango42 · · Score: 1

      Huggle and Twinkle are tools to help humans deal with vandalism. AntiVandalBot and ClueBot, etc., are bots that deal with (the most obvious) vandalism themselves. They are very different things.

    6. Re:Been done? by marpot · · Score: 1

      Exactly, but both kinds of tools need to solve the same underlying problem: given an edit, is it vandalism? The better those tools answer this question, the more time of Wikipedia editors is saved.

    7. Re:Been done? by pipatron · · Score: 3, Interesting

      I edit wikipedia occasionally, and one thing I remove is unmotivated links to companies, or unnecessary mentioning of specific products. So yes, I consider it a case of vandalism. Since my edits are usually (always?) kept, I think most people agree. There is probably some policy about it, but I act on common sense there.

      --
      c++; /* this makes c bigger but returns the old value */
    8. Re:Been done? by Tango42 · · Score: 1

      No, they solve very different problems. Something like Huggle needs to work out if a given edit can be almost guaranteed *not* to be vandalism (usually because the editor is on a whitelist), everything else gets shown to a human. The important thing for something like Huggle is making it easy for humans to review edits, not judging the edits automatically in any way. Something like ClueBot needs to work out if it can almost guarantee that a given edit *is* vandalism. They are very different.
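The distinction drawn above can be written as two separate decision rules. This is a sketch under stated assumptions: the function names, the score, and the 0.99 threshold are hypothetical and are not taken from Huggle or ClueBot.

```python
def review_tool_shows(editor: str, whitelist: set[str]) -> bool:
    """Review-tool style: skip trusted editors, show every other edit to a human."""
    return editor not in whitelist


def bot_reverts(vandalism_score: float, threshold: float = 0.99) -> bool:
    """Bot style: act alone only when the edit is almost certainly vandalism."""
    return vandalism_score >= threshold


trusted = {"LongTimeEditor"}  # made-up account name
print(review_tool_shows("LongTimeEditor", trusted), review_tool_shows("192.0.2.7", trusted))
print(bot_reverts(0.7), bot_reverts(0.995))
```

The first rule errs toward showing too much to humans, the second toward doing too little on its own, so the two tools sit at opposite ends of the same trade-off.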

    9. Re:Been done? by cerberusss · · Score: 1

      One of the quirks I've noticed is when a business makes, or invents something, then uses the Wiki to advertise. I can't help but wonder if this could also be considered a form of vandalism?

      It's advertising if they use advertisement text. Recently, I edited the article on the LEON processor. It originally had texts like:
      "It offers all basic functions of a pipelined in-order processor, making it a good experimentation vehicle."

      "Making it a good experimentation vehicle" for who? What type of experiments? What is good? How is it measured?

      It's very interesting if the bot could see the difference between such texts.

      --
      8 of 13 people found this answer helpful. Did you?
    10. Re:Been done? by Marcika · · Score: 1

      I edit wikipedia occasionally, and one thing I remove is unmotivated links to companies, or unnecessary mentioning of specific products. So yes, I consider it a case of vandalism. Since my edits are usually (always?) kept, I think most people agree. There is probably some policy about it, but I act on common sense there.

      The policy is on the page Wikipedia:Spam, quite logically. It's probably one of the oldest official policies, given that it was already needed back in 2003...

  4. Wikipedia needs a Flash editor by Anonymous Coward · · Score: 1, Insightful

    Wikipedia, the encyclopedia that anyone can edit - in my ass.

    Harry Potter:
    "The novels revolve around [[Harry Potter (character)|Harry Potter]], an orphan who discovers at the age of eleven that he is a wizard.{{cite web|url=http://edition.cnn.com/2000/books/reviews/07/14/review.potter.goblet/|title=Review: Gladly drinking from Rowling's 'Goblet of Fire'|date=14 July 2000|publisher=CNN|accessdate=28 September 2008}} Wizard ability is inborn, but children are sent to wizarding school to learn the magical skills necessary to succeed in the [[wizarding world]]. Harry is invited to attend the boarding school called [[Hogwarts|Hogwarts School of Witchcraft and Wizardry]]. Each book chronicles one year in Harry's life, and most of the events take place at Hogwarts.{{cite news|url=http://www.newsobserver.com/308/story/639602.html|title=Harry Potter, Hogwarts and Home|last=Frauenfelder|first=David|date=17 July 2007|publisher=The News & Observer Publishing Company |accessdate=29 September 2008}} As he struggles through adolescence, Harry learns to overcome many magical, social and emotional hurdles.{{cite web|url=http://www.southflorida.com/movies/sfe-potter-synopses,0,6711375.story|title=Plot summaries for the first five Potter books|last=Hajela|first=Deepti|date=14 July 2005|publisher=SouthFlorida.com|accessdate=29 September 2008}}"

    "=== Supplementary works ===
    {{see also|J. K. Rowling#Philanthropy|l1=J. K. Rowling: Philanthropy}}

    Rowling has expanded the [[Harry Potter universe]] with several short books produced for various charities.{{cite web|url=http://news.bbc.co.uk/1/hi/business/6903111.stm|title=How Rowling conjured up millions|publisher=BBC|accessdate=7 September 2008 | date=19 July 2007}}{{cite web|url=http://www.alibris.com/search/books/qwork/1198169/used/Comic%20Relief%20:%20Quidditch%20through%20the%20ages|title=Comic Relief : Quidditch through the ages|publisher=Albris|accessdate=7 September 2008}} In 2001, she released ''[[Fantastic Beasts and Where to Find Them]]'' (a purported Hogwarts textbook) and ''[[Quidditch Through the Ages]]'' (a book Harry read for fun). Proceeds from the sale of these two books benefitted the charity [[Comic Relief]].{{cite web|url=http://www.comicrelief.com/stuff-to-buy/harrys-books/the-money/|title=The Money|publisher=Comic Relief|accessdate=25 October 2007}} In 2007, Rowling composed seven handwritten copies of ''[[The Tales of Beedle the Bard]]'', a collection of fairy tales that is featured in the final novel, one of which was auctioned to raise money for the Children's High Level Group, a fund for mentally disabled children in poor countries. The book was published internationally on 4 December 2008.{{cite web|title=
    JK Rowling Fairy Tales To Go On Sale For Charity|work=ANI|year=2008|url=http://living.oneindia.in/insync/2008/harry-potter-jk-rowling-charity-020808.html
    |accessdate=2 August 2008}}{{cite news|url=http://news.bbc.co.uk/1/hi/entertainment/7142656.stm|title=JK Rowling book fetches £2m|date= 13 December 2007|publisher=BBC|accessdate=13 December 2007}}{{cite web|url=http://www.amazon.co.uk/gp/feature.html?docId=1000137983|title=Amazon purchase book|publisher=Amazon.com Inc|accessdate=14 December 2007}} Rowling also wrote an 800-word [[Harry Potter prequel|prequel]] in 2008 as part of a fundraiser organised by the bookseller [[Waterstones]].{{cite web|title=Rowling pens Potter prequel for charities|author=Williams, Rachel |year=2008|publisher=''[[The Guardian]]''|url=http://www.guardian.co.uk/books/2008/may/29/harrypotter.jkjoannekathleenrowling}} Retrieved on 31 May 2008.

    == Structure and genre ==
    {{see also|Harry Potter influences and analogues}}

    The ''Harry Potter'' novels fall within the genre of [[fantasy literature]]; however, in many respects they are also [[bildungsroman]]s, or [[coming of age]] novels.{{cite web|url=http://findarticles.com/p/articles/mi_m0OON/is_1_24/ai_107896944|title=Wizards and wainscots: generic structures and genre themes in the Harry Potter series|last=Anne Le Lievre|first=Kerrie|ye

    1. Re:Wikipedia needs a Flash editor by MillionthMonkey · · Score: 2, Insightful

      On the other hand, do gibberish pages like this need much more editing, or is Harry Potter's Wikipedia entry basically finished as far as anyone cares?

    2. Re:Wikipedia needs a Flash editor by bertok · · Score: 1

      Wikipedia, the encyclopedia that anyone can edit - in my ass.

      Actually, it's possible to make a wysiwyg editor for Wiki markup in HTML with a little Javascript, there's no need for Flash!

      It's not even hard, I did one for a corporate project in about a week, and I'm by no means an expert at Javascript.

      It's even possible to do a split-screen view where it shows you the markup AND the preview, and the user can edit either.

      The trick is that doing this has a prerequisite: the wiki syntax has to have a nice unambiguous grammar, and you need a parser generator that can emit Javascript parsers for it. At first, I tried to base my wiki grammar on Wikipedia's syntax, but it turns out that it's ambiguous and difficult to parse, so I made one from scratch, and used ANTLR to generate a parser for it. The actual project used the C# parser (it was for an ASP.NET web site), but I experimented with real-time parsing in web pages using the JS parser. It works well, and is fast enough for pages of about 1KB. It would need some clever programming to scale past that, like incremental parsing.

      This wouldn't work for Wikipedia though, because the way it has been written is typical PHP spaghetti code. It relies heavily on repeated "search & replace" operations and regular expressions, which sounds not-too-bad, until you have to figure out the formal grammar or the page object model. It's got layers of crap on top of each other, with no rhyme or reason. For example:

      Both of these are: '''bold''' <b>bold</b>

      You can have nested code too: ''italic <b>italic-bold''bold</b>

      This complex mess of Wiki markup, legacy HTML markup, and some XML-like elements makes parsing Wikipedia a royal pain. The nested syntax is especially nasty. Everyone else with any sense is moving toward XHTML-like syntax, where open tags have to be closed in a strict reverse sequence.
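The "strict reverse sequence" rule can be illustrated with a small stack-based checker. This is a toy sketch covering only `''`, `'''`, and `<b>` tags. It is not how MediaWiki parses pages, whose real quote handling is considerably more involved, and `is_strictly_nested` is a hypothetical name.

```python
import re

# Longest alternative first, so ''' is not read as '' followed by a stray quote.
TOKEN = re.compile(r"'''|''|</?b>")


def is_strictly_nested(markup: str) -> bool:
    """True if every opened span is closed in strict reverse order."""
    stack = []
    for token in TOKEN.findall(markup):
        if token in ("''", "'''"):
            # Quote runs toggle: the same token both opens and closes a span.
            if stack and stack[-1] == token:
                stack.pop()
            elif token in stack:
                return False  # closes an outer span while an inner one is open
            else:
                stack.append(token)
        elif token == "<b>":
            stack.append(token)
        else:  # "</b>"
            if not stack or stack[-1] != "<b>":
                return False
            stack.pop()
    return not stack


print(is_strictly_nested("'''bold''' <b>bold</b>"))
print(is_strictly_nested("''italic <b>italic-bold''bold</b>"))
```

A grammar that rejects the second input outright is what makes a generated parser practical. Accepting it, as Wikipedia's markup does, forces the renderer to guess what the author meant.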

    3. Re:Wikipedia needs a Flash editor by Anonymous Coward · · Score: 0

      That would be a valid point if all articles started out neat and clean and only transitioned to an uneditable source-gibberish state when they were finished, settled and neutral. I don't think that's the case though.

    4. Re:Wikipedia needs a Flash editor by jgrahn · · Score: 1

      Wikipedia, the encyclopedia that anyone can edit - in my ass. [---] I am convinced that the current state of affairs is a conscious choice. The way to maximise 'insider power' and minimize 'outsider power' is to make editing as hard as possible, and the rules and traditions needed not to be revoked as many as possible.

      Yes. That was also the main driving force behind RUNOFF, troff, TeX, LaTeX, HTML and all other non-WYSIWYG systems back into the 1960s. It's a conspiracy.

      Seriously: no. It's just that it's the easiest system to work with, unless you are too lazy to learn a little bit of syntax. The HP text you quoted looks bad because (a) they didn't use line breaks to make the code readable and (b) OK, the system with citations inline in the text sucks.

  5. How about an Admin Abuse Detector? by Anonymous Coward · · Score: 3, Insightful

    I've had many more problems with admin abuse than vandalism. Vandalism is quick and easy to deal with. Admins are the biggest problem in Wikipedia editing; they have no accountability and abuse their power.

    How about a log of each admin's activities, including reversions, bans, etc, and a way for non-admins to challenge actions (without spending countless hours in an appeal process worthy of a federal court).

    1. Re:How about an Admin Abuse Detector? by Anonymous Coward · · Score: 1, Informative

      I've had many more problems with admin abuse than vandalism. Vandalism is quick and easy to deal with. Admins are the biggest problem in Wikipedia editing; they have no accountability and abuse their power.

      How about a log of each admin's activities, including reversions, bans, etc, and a way for non-admins to challenge actions (without spending countless hours in an appeal process worthy of a federal court).

      What are you talking about? All users have logs that track their actions:

      http://en.wikipedia.org/wiki/Special:Contributions/Jimbo_Wales
      http://en.wikipedia.org/w/index.php?title=Special%3ALog&type=&user=Jimbo+Wales&page=&year=&month=-1&tagfilter=

      Actions can be challenged at any point on the talk page or the administrator boards.

    2. Re:How about an Admin Abuse Detector? by OverlordQ · · Score: 4, Informative

      How about a log of each admin's activities, including reversions, bans, etc, and a way for non-admins to challenge actions (without spending countless hours in an appeal process worthy of a federal court).

      Reversions: http://en.wikipedia.org/wiki/Special:Contributions
      Bans: http://en.wikipedia.org/wiki/Special:Log/block
      Deletes: http://en.wikipedia.org/wiki/Special:Log/delete

      Anything else you're too lazy to find yourself?

      --
      Your hair look like poop, Bob! - Wanker.
    3. Re:How about an Admin Abuse Detector? by Hurricane78 · · Score: 1

      If you think about it, it’s not much different from a country with total censorship. This small establishment’s view always overrides everybody else’s. And they make massive use of that power.

      As I said: As long as it is even possible for a subset of humanity, to control what’s going onto Wikipedia, it can by definition not be the encyclopedia for all of humanity.
      It’s obvious that to solve this, central servers and admins are out of the question... resulting in a P2P system of cascading trust relationships.

      --
      Any sufficiently advanced intelligence is indistinguishable from stupidity.
    4. Re:How about an Admin Abuse Detector? by Anonymous Coward · · Score: 1, Interesting

      I'm the OP.

      Anything else you're too lazy to find yourself?

      I'd recognize that voice anywhere; you must be a Wikipedia Admin. I've been editing Wikipedia for years, but didn't know about the second two lists (the first isn't really a list of reversions, but perhaps there's a way to make it work). If I don't, then I suspect many others don't.

      Which brings us back to my point: Those lists need to be part of a system -- an easily accessible, understandable system -- "for non-admins to challenge actions (without spending countless hours in an appeal process worthy of a federal court)." I don't have time to find and study every function, rule, and procedure on Wikipedia that might apply. The overhead of editing is so high -- primarily because of admin abuse -- that I've stopped doing it. The frustration of dealing with people who behave poorly doesn't help.

    5. Re:How about an Admin Abuse Detector? by Anonymous Coward · · Score: 0

      All users have logs that track their actions

      See my response to the other poster with a similar response. I wanted to add that Jimbo Wales is an interesting example. In the past he has removed things from logs simply because he doesn't personally like them (I seem to remember some discussion page debate about his birthday, for example).

      The problem with Wikipedia starts at the top; if Jimbo Wales is editing his girlfriend's page and abusing his power, he sets a standard that everyone will follow.

  6. What? by Anonymous Coward · · Score: 0

    How do we tell intent from the resulting content? Yes, clearly FUCK UR MOM and UR MOM SUCKS COCKS IN HELL are vandalism, but what about misinformed people making edits? Is that vandalism? There is no bad intent there. What about people who don't understand Wikipedia making edits that are non-neutral? Is that vandalism?

    1. Re:What? by Ignorant+Aardvark · · Score: 2, Informative

      In response to whether those two examples are vandalism, the answer is no, they are not.

      You'd need a strong AI to be able to make those determinations, and if such a thing existed, it'd make more sense just to have the strong AI write the encyclopedia.

      What we're talking about here is obvious vandalism (blanking, insertion of curse words, etc.) of the type that can be detected by an algorithmic/heuristic program.
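
      A minimal sketch of what such an algorithmic/heuristic check might look like, covering only the two examples named above (blanking and inserted curse words). The word list, the thresholds and the function name are all made up for illustration; this is not how any actual Wikipedia bot is known to work.

```python
import re

# Hypothetical word list and thresholds, chosen only for illustration.
BAD_WORDS = {"crap", "sucks", "poop"}
MIN_ARTICLE_LENGTH = 200  # ignore blanking checks on very short pages
BLANKING_RATIO = 0.1      # new text under 10% of the old length counts as blanking


def obvious_vandalism_reasons(old_text: str, new_text: str) -> list[str]:
    """Return the heuristic rules that an edit trips (empty list if none)."""
    reasons = []

    # Rule 1: blanking, i.e. most of a non-trivial article was removed.
    if len(old_text) > MIN_ARTICLE_LENGTH and len(new_text) < BLANKING_RATIO * len(old_text):
        reasons.append("blanking")

    # Rule 2: listed words that appear in the new revision but not in the old one.
    old_words = set(re.findall(r"[a-z]+", old_text.lower()))
    new_words = set(re.findall(r"[a-z]+", new_text.lower()))
    if (new_words - old_words) & BAD_WORDS:
        reasons.append("profanity")

    return reasons
```

      Anything that needs intent or factual accuracy to judge (the misinformed or non-neutral edits asked about above) passes straight through a check like this.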

  7. Should work. Bogofilter for autotagging emails by Colin+Smith · · Score: 1

    Bayesian statistics are an interesting thing. Mwhwhwhwhaaaa. Who thought they would say that about stats?

    Anyway, you can tell spam with a remarkably high degree of accuracy... Guess what. You can tell "Important" and "friends" emails with a similar degree of accuracy (you define what's important or who are friends). No offence to most vandals (of any type), but usually they are complete fuckwits. I suspect they and what they write are probably even more predictable than spammers.
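
    For the curious, a toy sketch of the Bayesian idea applied to edits rather than emails. This is a plain word-count naive Bayes classifier with add-one smoothing, not Bogofilter's actual algorithm; the class name and the two labels are assumptions made for the example, and it expects at least one training example per label.

```python
import math
from collections import Counter

LABELS = ("vandalism", "regular")  # assumed label names for this sketch


class NaiveBayesEditFilter:
    """Tiny word-based naive Bayes classifier with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text: str, label: str) -> None:
        """Record one labelled example."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def log_score(self, text: str, label: str) -> float:
        """Log prior plus the sum of smoothed per-word log likelihoods."""
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set().union(*self.word_counts.values()))
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total + vocab))
        return score

    def classify(self, text: str) -> str:
        """Return the label with the higher log score."""
        return max(LABELS, key=lambda label: self.log_score(text, label))
```

    Train it on a pile of edits that humans have already labelled, then call `classify` on the text added by a new edit. How well it does depends entirely on the training data.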
     

    --
    Deleted
    1. Re:Should work. Bogofilter for autotagging emails by EdIII · · Score: 1

      The goal is the development of a practical fuckwit detector that is capable of telling apart ill-intentioned posts from well-intentioned posts.

      You gave me a good idea....

    2. Re:Should work. Bogofilter for autotagging emails by Anonymous Coward · · Score: 0

      They are also profoundly limited by the published rulesets for particular tools. Take a good look at CRM114 for better filtering: Markovian filtering, fast, and the filtering is generated semi-randomly from the existing data. Tuning your spam to avoid particular filters is extremely difficult, because the filters are not pre-determined.

  8. Step One by owlnation · · Score: 4, Insightful

    Before any more detectors are rolled out, how about they come up with a workable definition of vandalism? And actually use it fairly, ethically and logically.

    There's a great deal of evidence to suggest the current definition of "vandalism," is something a wikiadmin decides he just doesn't like, or disagrees with, or in some way interferes with his power-trip.

    1. Re:Step One by gbjbaanb · · Score: 1

      It should be inaccurate revisions, however who is to say that a revision is inaccurate or not. We could have a panel of experts for each given topic, but that'd only work if you divided WP up into sections and had an admin sitting like a judge on each section.

      As a result: ""vandalism," is something a wikiadmin decides he just doesn't like, or disagrees with, or in some way interferes with his power-trip."

    2. Re:Step One by Anonymous Coward · · Score: 2, Insightful

      I completely agree. The worst vandalism on Wikipedia is done by self-righteous page owners and admins on power trips who hate to be corrected. I used to help out on a number of pages (areas where I am a genuine expert, not just someone with an opinion), but having my updates constantly deleted just got too frustrating. Now I just make sure people in my field know not to use Wikipedia.

    3. Re:Step One by Homburg · · Score: 1

      There's a great deal of evidence to suggest...

      And yet you don't include any reference to this supposed evidence.

    4. Re:Step One by Nihiltres · · Score: 1

      Or, in other words, [citation needed]. (also, is [citation needed] a meme when discussing Wikipedia? ) There's a wide variety of material that will result in reverts or blocks that isn't really vandalism, though. Behaviour that's disruptive, trolling, a breaching experiment, etc. will elicit roughly the same response as vandalism, and that needs to be taken into account both for automatic vandalism-repair systems (should this process treat it as vandalism?) and for making the statement that vandalism is ill-defined or that it's used for corrupt purposes. My guess is that some people are lumping the disruptive behaviour, etc. into "vandalism" when it really ought to be labelled "trolling" or some such—the response is the same, but the semantics *sigh* (Disclosure: I am an admin on Wikipedia.)

    5. Re:Step One by Anonymous Coward · · Score: 0

      It should be inaccurate revisions, however who is to say that a revision is inaccurate or not.

      Yes, because "New York is bigger than your mom" is totally not vandalism. Or "It fucking sucks" is a legitimate addition to a page about a TV show. Or "I have a big penis" is a great addition to a page on Sour Cream.

    6. Re:Step One by Anonymous Coward · · Score: 1, Informative

      mod parent up

  9. The problem is the edits going live... by Anonymous Coward · · Score: 2, Interesting

    Right now, you can think of wikipedia as having two columns per article - first is the working article column, with the second being the discussion column.

    What we really need is a third column, one for the currently published version of the article.

    While this may not be popular, it would go a long way to getting rid of the spam, and might even solve some of the other issues facing wikipedia.

    With such a system, you could even assign articles to a subject matter expert as the editor, who could approve changes, or just incorporate the best changes in.

    Not every article would need to have this, but as articles mature, they could move to this over time.

    1. Re:The problem is the edits going live... by Shoe+Puppet · · Score: 4, Informative

      A system like this has been implemented for the German Wikipedia. Almost everybody who has an account can verify articles to be vandalism-free; unless you are logged in, you see the last verified version by default.

      --
      (+1, Disagree)
    2. Re:The problem is the edits going live... by s1lverl0rd · · Score: 1

      And does it work?

    3. Re:The problem is the edits going live... by Vintermann · · Score: 1

      I don't know, but I'll say this for German Wikipedia: It's a much better piece of work in my opinion. You can find huge articles with lots of great information on obscure topics, but which are written by "true fans" in a slightly unorthodox style - stuff that would be deleted in a heartbeat on English wikipedia. I don't know what they are doing, but they appear to be much more successful at accepting casual contributions.

      --
      xkcd is not in the sudoers file. This incident will be reported.
    4. Re:The problem is the edits going live... by Shoe+Puppet · · Score: 1

      My experience is very different: When looking for obscure topics, I usually head straight to the English one since the German one is often the only one that does not consider the topic to be "notable" enough for an article.

      --
      (+1, Disagree)
  10. Have a look at the Dawn Wells Talk Page by Anonymous Coward · · Score: 0

    Just have a look on the Discussion Page for "Dawn Wells" to understand why most Wikipedia Admins are Fuck Wads.

    1. Re:Have a look at the Dawn Wells Talk Page by MillionthMonkey · · Score: 1

      In 1993, Wells published Mary Ann's Gilligan's Island Cookbook with co-writers Ken Beck & Jim Clark, including a foreword by Bob Denver, to whom she had sent an envelope of marijuana through the mail years earlier.

      There, that wasn't so hard. (BTW, to all you cookbook writers out there: I can write a good foreword.)

    2. Re:Have a look at the Dawn Wells Talk Page by Neoprofin · · Score: 1, Insightful

      Seems to me like one user is trying to add a highly biased account of a single incident in her life that is many times longer than the rest of the article, and throwing a screaming fit when multiple admins tell him that it would be in violation of multiple measures of quality. Further, mention of an "edit war" implies to me that the user tried to force his section in after repeated warnings, and when told to file an RfC he just continued to argue with just about anyone he could find.

      Those power abusing fuck wads.

  11. quite a bit of work on this by Trepidity · · Score: 2, Interesting

    Since the problem is tantalizingly easy to frame as a standard data-mining or machine-learning problem, albeit with some quirks, there's quite a lot of work from a lot of research groups that seems to be looking at it. Some examples: one, two, three, four, five, six, seven.

    1. Re:quite a bit of work on this by marpot · · Score: 1

      You're right, it's machine learning, data mining, NLP, and information retrieval. But the fun thing is turning a research prototype into a tool that can be left alone most of the time. That hasn't happened yet. Also, research on this problem only started in 2008, while rule-based tools developed by Wikipedians have been around since 2006. All the works you listed are actually all there is! That's not much to work with, is it?

  12. The Art and Science of Wikipedia Vandalism by MillionthMonkey · · Score: 5, Interesting

    There is an art to Wikipedia abuse. If someone cites a Wikipedia article in some argument they're making, you can always just go to Wikipedia and edit the page so that they're wrong. But that's what a novice Wikipedia vandal does.

    A pro knows to edit the article in a very subtle way, so that it looks like the person has poor reading comprehension. Let's say the person cites a Wikipedia article with a sentence like this, in order to support the argument that Colbert is a Democrat.

    Although by his own account he was not particularly political before joining the cast of The Daily Show, Colbert is a self-described Democrat.[12][13]

    This bears the mark of authority, because of the footnote superscripts that are already on it. (We can skip the step where we maliciously relocate them here.)

    A novice might change it to this (correctly preserving the authoritative footnote superscripts):

    Although by his own account he was not particularly political before joining the cast of The Daily Show, Colbert is a self-described Republican.[12][13]

    It makes the person appear to be wrong, and the vandalism is obvious, like swapping Eurasia for Eastasia. There's no way he could have misread that.

    But change it to this

    Although by his own account he was not particularly political before joining the cast of The Daily Show, Colbert has even been described as a Democrat.[12][13]

    and the person looks not only wrong, but plausibly wrong because it looks like he can't read. That's what makes successful Wikipedia vandalism an art.

    1. Re:The Art and Science of Wikipedia Vandalism by Anonymous Coward · · Score: 0

      Wikipedia is a steaming pile of biased and petty dog crap.

  13. take your POV somewhere else by H4x0r+Jim+Duggan · · Score: 0, Troll

    Just because the tag exists, doesn't mean you can slap it everywhere you see an edit that doesn't support your world view! You deletionists are ruining Wikipedia for the rest of us. Assume Good Faith!

    1. Re:take your POV somewhere else by Anonymous Coward · · Score: 1, Funny

      Assume Good Faith!

      I have candy. Get in the van.

    2. Re:take your POV somewhere else by Kitkoan · · Score: 1

      Assume Good Faith!

      I have candy. Get in the van.

      Ehhhhhh.... in the words of Ogden Nash 'Candy is dandy, but liquor is quicker'.

      --
      Attention... all grammer nazi"s! Is they're anything; wrong with: my post,
  14. Nice template by MillionthMonkey · · Score: 4, Funny

    Whoever posted this clearly isn't aware of the actual work being done in the field. For instance, I was running a ___[thing]___ in _[year]_, and it wasn't new at the time. They've gotten much more sophisticated since then. Why are they so intent on reinventing the wheel? Do they not even realize that the wheel exists already? Why not just improve on it instead?
    * * *
    This looks like a useful template for the standard "why reinvent the wheel" Slashdot post; I hope you don't mind if I reuse it.

  15. A good step forward by allo · · Score: 1

    If it stops Deletionists from deleting well-intended edits. Better a short article than no article.

  16. The Answer has existed for years by jhary-a-conel · · Score: 1

    It was just too visionary for its time http://www.everytopicintheuniverseexceptchickens.com/

  17. An arms race? by fysdt · · Score: 2, Interesting

    I believe that vandalism on Wikipedia can be limited. But would it really be possible to detect all kinds of vandalism?

    FTA:
    "Yahoo! Research will award a cash prize of 500 Euros to the winner of the plagiarism detection task. "

    500 euros doesn't sound like much for detecting plagiarism on a site like Wikipedia...

    1. Re:An arms race? by LtGordon · · Score: 1

      I believe that vandalism on Wikipedia can be limited. But would it really be possible to detect all kinds of vandalism?

      Without strong AI, the system can only really look for statistical and language patterns for clues on vandalism.

      If I replace an entire body section on the Fox News page with "GLENN BECK BLOWS GOATS", I would hope that a vandalism detector would flag this. If, however, I randomly insert the sentence "Glenn Beck has also been accused of inappropriate relations with barnyard animals" into a large section, then automated detection comes down to statistics or one hell of a clever context algorithm.

  18. If you want to stop 'vandalism' by Anonymous Coward · · Score: 0

    Sack William Connolley.

    That eco-terrorist has vandalised more climate-related pages (5500+) than the rest of the vandals put together.
    http://wattsupwiththat.com/2009/12/22/william-connolley-and-wikipedia-turborevisionism/

  19. What counts as vandalism on Wikipedia? by cptnapalm · · Score: 1

    I ask because I don't know. I can see turning a page into a screed as vandalism, but that doesn't differ greatly from many of the wikipedia articles that I've read; quite a few of them are overwhelmingly dedicated to hostility to the topic or advocates of the topic. Earlier today, when I was reading the news, there was a link to the Wikipedia article on the Tea Party movement: well over half of the article was dedicated to quotes from anti-Tea Party people (MSNBC, NYT, LAT, etc.) spouting off hostility to it.

    Is that vandalism?

    1. Re:What counts as vandalism on Wikipedia? by Tango42 · · Score: 3, Insightful

      Officially, vandalism is defined as edits made in bad faith. If you are trying to improve the article but are an idiot (which includes people that don't realise their own bias), that isn't vandalism, it's just idiocy. It is only if you are editing with the intention of making the article worse that you are vandalising.

    2. Re:What counts as vandalism on Wikipedia? by WolfWithoutAClause · · Score: 1

      The Wikipedia is trying to fairly reflect the reliable sources' multiple positions, so including 'spouting off' is not necessarily vandalism if the neutral point of view of the reliable sources is that there is some hostility to the tea party.

      --

      -WolfWithoutAClause

      "Gravity is only a theory, not a fact!"
    3. Re:What counts as vandalism on Wikipedia? by Slashcrap · · Score: 0, Flamebait

      The people involved in the tea-bagging movement (I know they changed their name once they realised, tough shit) are objectively scum. So it's fine for the article to be negative. Also it helps to increase their paranoia regarding left-wing media conspiracies, and will hopefully bring forward the day when they really do take up arms and subsequently get murdered en-mass by the police & military, which is the optimum outcome in this case.

      What I'm really trying to say is that you sound like another whiny republican/libertarian who is "just asking questions" and I hope you die of something horrible at the first possible opportunity.

  20. Think twice before assisting this harmful project. by jonathansamuel2 · · Score: 0, Redundant

    This project will place more power in the hands of anonymous, faceless Wikipedia bureaucrats. It is therefore harmful. If Wikipedia bureaucrats are too lazy to review possibly offensive material by hand and instead want a machine to do it for them then MAYBE the world does not need that kind of Wikipedia at all.

    If you want to view a Wikipedia administrator drunk with his own sense of self-importance check this out:

    http://en.wikipedia.org/wiki/User_talk:EdJohnston#Jonathansamuel

  21. nuances and language mechanisms galore by icepick72 · · Score: 1

    As soon as you start trusting a vandalism detector over manual monitoring, a lot of stuff will start to slip through and get into the news, and then the detector won't be trusted any longer. It will have a short life but will be interesting to watch.

    Sew m@ny things that can bee done to bypass mechanisms. Even simple euphemisms like cleaning the old rifle http://images.clipartof.com/small/5039-Man-Cleaning-Inside-The-Barrel-Of-His-Unloaded-Rifle-Gun-Clipart.jpg ...are sure to slip through. There are so many language mechanisms that can be used to fool automated tools, but that will be immediately recognized by people.

  22. Yeah but what about when it's not vandalism ... by PaganRitual · · Score: 2, Funny
  23. Re:Think twice before assisting this harmful proje by Tango42 · · Score: 1

    If the world doesn't want Wikipedia, they are more than welcome to stop reading it. In truth, however, it seems the world very much wants Wikipedia, since it is the 5th most popular website in the world (by unique visitors per month, if memory serves).

  24. Well-intentioned edits? by MSTCrow5429 · · Score: 1

    There are well-intentioned edits on Wikipedia? Even if there were, how could you tell...

    --
    Slashdot: Playing Favorites Since 1997
  25. In Wikipedia, everything is transparent by saibot834 · · Score: 4, Informative

    If I had mod points, I'd mod the parent up and the grandparent down. Seriously, almost everything in Wikipedia is transparent. Search the revision history and logs and look for the information you need. RTFM.

    A lot of people on /. seem to derive very general opinions about admins from a personal disappointing encounter. They do not include diffs of their edits or their username. From my experience in most cases the guy who got reverted by an admin broke some kind of rule (and often enough they just got reverted by a regular non-admin, but they assume it was an admin). Instead of RTFM those people post as AC complaining generally about admins without providing any traceable cases of admin abuse. I know my opinion isn't very popular, but unless you give concrete examples your allegations are just FUD.

    1. Re:In Wikipedia, everything is transparent by Anonymous Coward · · Score: 0

      Amen. I realize that I am an AC, but I fully agree with the above post. A lot of complaints are nothing more than "I don't like what happened". However, deriving general opinions from single incidents is not a new thing---all people everywhere do it. It is more a problem with Slashdot or internet idiocy in general, I think.

      [That said, I must point out that I have a low view of Wikipedia itself. But it has merits too, most notably that it is 100% transparent, right down to editors' motivations (see edit logs).]

    2. Re:In Wikipedia, everything is transparent by Jedi+Alec · · Score: 2, Insightful

      Hey, this is Slashdot. We're qualified to discuss any subject we damn well please based on our own prejudices and assumptions, while pretending that our high IQ's and common sense qualify us to pretend we're experts on whatever the discussion may or may not be about. What right do wiki admins have to assault our ivory towers when we sprinkle our droplets of distilled wisdom on their pages as well?

      --

      People replying to my sig annoy me. That's why I change it all the time.
    3. Re:In Wikipedia, everything is transparent by Anonymous Coward · · Score: 0

      There are many well-known examples. You are either ignorant or pretending to be ignorant.

  26. DWIM, PDCH by symbolset · · Score: 2, Funny

    You're looking for a DWIM (Do What I Meant) interpreter with PDCH (Predictive Digital Concierge Heuristics). While the technology is available it's currently quite costly. Bugs, errata, and maintenance can deliver less than an optimal experience. Might I instead offer you this mail order bride? We have imported personal assistants in stock from less privileged nations - and if you have the means we can outsource minute-to-minute management of them to our Bangalore VPDT (Virtual Presence Discipline Team). Please consult your accountant and tax lawyer concerning withholding for personal staff, particularly if you intend to pursue public service.

    /At your service!

    --
    Help stamp out iliturcy.
  27. Vandalism Detector Unnecessary? by sixknowspring · · Score: 1

    From my experience with contributing to Wikipedia, and from reading some of the talkback (is that what they're called?) discussions, I don't think there's much need for such a tool; there seems to be an elite class of Wiki users that delete anything that they deem unworthy while giving the most bizarre reasons for doing so.

  28. Solution: color coding for edits by nephridium · · Score: 1

    I still think the best solution would be a color coding overlay over the text that would show the reader immediately 1.) how trustworthy the author has been and 2.) how long ago the edit was made (without being reverted). That way it would be easy to see the sections written by reputable authors who have always added useful info and distinguish them from "amendments" that have been entered just a few minutes ago by an anonymous coward.

    And for those who do not want to log in to edit, that would be fine too: if the edit stands the test of time, it's highly probable that the information entered was good, so over time it will get a similar color "status" as an edit from a reputable author. It would also be easy to see last-minute amendments by known authors, which, as we all know, should be taken with a (larger than usual) grain of salt, no matter how well known the author is ;)

    Just add a toggle button to switch between default view and the color coded view.

    BTW this system would also work very well for blogs and news sites.
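
    A rough sketch of how the two inputs could be folded into one overlay color. Everything concrete here is an assumption made for illustration: the 0..1 trust scale, the exponential aging with a 30-day constant, the rule that reputation alone earns at most half credit (so a fresh edit by a known author still stands out), and the red-to-green mapping.

```python
import math


def edit_confidence(author_trust: float, age_days: float, decay_days: float = 30.0) -> float:
    """Combine author reputation (0..1) and unreverted age into a 0..1 score."""
    if not 0.0 <= author_trust <= 1.0:
        raise ValueError("author_trust must be between 0 and 1")
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    # Assumption: a brand-new edit starts at half the author's trust,
    # and every surviving edit drifts towards 1.0 the longer it stands.
    start = 0.5 * author_trust
    return 1.0 - (1.0 - start) * math.exp(-age_days / decay_days)


def confidence_color(score: float) -> str:
    """Map a 0..1 score onto a red-to-green hex color for the overlay."""
    score = min(max(score, 0.0), 1.0)
    red = round(255 * (1.0 - score))
    green = round(255 * score)
    return f"#{red:02x}{green:02x}00"
```

    With these made-up numbers, an anonymous edit made a minute ago renders red, a fresh edit by a fully trusted author renders halfway, and either one turns green if it survives long enough.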

    --


    And when you gaze long enough into the code, the code will also gaze into you.
  29. Re:Think twice before assisting this harmful proje by Anonymous Coward · · Score: 0

    If the world doesn't want Wikipedia, they are more than welcome to stop reading it. In truth, however, it seems the world very much wants Wikipedia, since it is the 5th most popular website in the world (by unique visitors per month, if memory serves).

    The problem isn't that the world doesn't need the Wiki, it's that the world generally misunderstands the Wiki.

    Despite any claims to the contrary, the only USEFUL information on the Wiki is the references which are cited. The articles themselves are pure trash, and in most cases are the end result of a flame war between various editors. In the end you either have a horribly, obviously biased article, a completely deleted article, or an article which has been rendered so vague as to be useless through constant edit tinkering.

    In short, the Wiki should NEVER be directly referenced for any type of citation, since (by its own claim) the only information in the Wiki is itself backed by outside sources. So if you need to cite the Wiki, at the very least use the citations they already dug up for you. The Wiki is a starting point for information, not the destination.

  30. Re:Think twice before assisting this harmful proje by jonathansamuel2 · · Score: 1

    My hope would be that whether they read Wikipedia or not, people would not support projects like this one which place more power in the hands of Wikipedia admins. Such projects by definition place less power in the hands of ordinary Wikipedia users.

    Hopefully companies like Google will also question whether Wikipedia is deserving of $2M contributions, especially when in terms of democratic process Wikipedia is getting worse instead of better, as admins go off on their power trips with more and more powerful tools.

    Read the Wikipedia talk page for the Martin Heidegger article and you can see that parts of Wikipedia are infested with Neo-Nazi sympathizers who have the protection of a particular Wikipedia admin.

    http://en.wikipedia.org/wiki/Talk:Martin_Heidegger

  31. Vandalize the Vandals?? by draco_00 · · Score: 1

    Who cares? 90% of the info on those sites is bogus anyway; it's like trying to fix the perpetually broken!!

  32. Total waste of time by BradMajors · · Score: 1

    Wikipedia's administrators don't seem to have a clue about the effects of vandalism.

    The time wasted by humans whose job is solely to revert vandalism is irrelevant. There are more than enough people who are willing to do this work, and if they weren't doing this work they would not be contributing useful content to Wikipedia.

    The negative effects are concentrated on the knowledgeable editors who are adding useful new content. There may be 5 to 10 persons actively adding content to an article. Each time a change is made to the article, each of these editors needs to examine the content to determine if it is

  33. Vandalism, as defined by Wikipedia, by Hurricane78 · · Score: 1

    is everything that the admin establishment doesn’t agree with. Just like in a state with total censorship.
    And on top of that, the admins often don’t know shit about anything.
    Which is not surprising, considering that they most likely sit in underpants in their basement all day long. Why else would they have so much time to troll around Wikipedia on a deletion spree? Which is obviously not a very mentally healthy thing to do either.

    It’s simple: As long as Wikipedia can at all be controlled by a subset of humanity, it’s doomed to fail as an encyclopedia for all people. By definition.
    That’s why it must become a P2P system. With cascading information source rules definable by every user for himself. With everybody being able to be the publisher of his view of Wikipedia.

    Because in the end, nearly all you know, is based on the trust on other sources (human beings) anyway. (Yes, including most of what you call “facts”. Unless you checked for yourself, that information IS based on trust.)

    --
    Any sufficiently advanced intelligence is indistinguishable from stupidity.
  34. it's good at detecting OBVIOUS vandalism by capoccia · · Score: 1

    there is a subset of vandalism that a bot can be very good at detecting. this bot can never handle every kind of vandalism. for example, adding some subtly false statement to a biographical article, but spelling everything correctly, using correct grammar and adding something that looks like it could be a legitimate source is difficult for even human editors to recognize as vandalism.

    adding 1s everywhere or deleting the entire article is very easy to detect.

    1. Re:it's good at detecting OBVIOUS vandalism by s1lverl0rd · · Score: 1

      Luckily, there is a lot more obvious vandalism than there is vandalism of the sneaky kind.

  35. How do you define *Vandalism* ? by Taco+Cowboy · · Score: 4, Interesting

    Case in point --- There is an article in Wikipedia about a certain country.

    In that article, they blame their previous British colonial master for everything.

    I tried to make some corrections to that article to make it more "neutral", and they changed it back within 10 minutes.

    I tried again, and again they changed it back.

    For the third time, I was warned by someone from Wikipedia (dunno if it's a volunteer or something) that I have no right to make any correction to that particular article anymore.

    The "THEY" in question is the government of that country. They have a "cyber-patrol" group in charge of "online propaganda" and that Wikipedia article is one of their many lies, aka propaganda, they have put online.

    Now, how do you define vandalism in this case?

    --
    Muchas Gracias, Señor Edward Snowden !
    1. Re:How do you define *Vandalism* ? by svick · · Score: 1

      This is not vandalism, but a violation of neutral point of view. You should try to talk with them first, not start an edit war. If talking fails, you should ask others to help you resolve the dispute; in this case the Neutral point of view noticeboard is probably the best place.

  36. qualifying your adversary by epine · · Score: 1

    Officially, vandalism is defined as edits made in bad faith.

    In other words, the scope of the problem does not include discovering the cure for human stupidity, however laudable that might be.

    Furthermore, people here are failing to apply the 80-20 rule: if you can clean up 80% of the vandalism at 20% of the human effort currently expended, the attention available to deal with the difficult twenty percent would more than triple. I've seen entire pages replaced with the word "penis" or a crass four word comment about some pimple twit schoolmate. There's a lot of low hanging fruit here.
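
The arithmetic behind "more than triple" can be checked with a short sketch. It assumes one reading of the claim: automation handles 80% of the vandalism and consumes 20% of today's human effort, so the remaining 80% of effort is spent on the remaining 20% of vandalism. The function name is hypothetical.

```python
def attention_multiplier(share_automated: float, effort_share_spent: float) -> float:
    """Factor by which human attention per remaining vandal edit increases.

    Assumes effort was spread evenly over all vandalism before automation.
    """
    remaining_effort = 1.0 - effort_share_spent
    remaining_vandalism = 1.0 - share_automated
    return remaining_effort / remaining_vandalism


# Under the 80-20 split described above, the factor comes out at about 4.
factor = attention_multiplier(0.8, 0.2)
```

Under these assumptions the hard cases get roughly four times the attention they get now, which is consistent with "more than triple".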

    I sometimes think Wikipedia needs to implement a mechanism where citations are corroborated by some semi-trusted party: "yes, this citation really contains the support for the claim added to the article." Any editor who hasn't contributed a corroborated citation needs to be kept on a fairly short rope. My opinion is that the underlying currency of good faith contribution is the properly cited claims, preferably from A-list source material and not Joe Random Blog.

    How much vandalism is contributed by editors who have added fully sourced claims to three or more articles? If I've seen such a case of vandalism, I can't recall it. I've seen editors make half a dozen quasi-good faith contributions (always unsourced) who have then degenerated into petulance and destruction, perhaps when testing limits becomes a better way to get noticed.

    Most of the vandalism I've run into has been fairly fresh, usually a couple of days old or at most a week. On obscure articles, I've encountered heavy vandalism that persisted unchallenged for months. In some ways the long-standing dark-corner vandalism is more problematic, like the mother-in-law who swipes her finger in some obscure crevice to document a damning laxity.

    Another case I've often seen is vandalism caught by someone inexperienced, and fixed in that instance (but not with a conspicuous revert), while ten other vandalisms from the same editor on the same spree remain unrepaired. If an unproven editor's contribution seems to be suffering a higher than normal attrition rate, then everything the editor has done should be flagged for attention.
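
The attrition idea could be sketched roughly as follows. The input format (pairs of editor name and a reverted flag), the function name, and the default thresholds are all assumptions for illustration; a real tool would need to define what counts as a contribution being undone.

```python
from collections import defaultdict


def editors_to_review(edits, baseline_rate=0.1, min_edits=5, factor=3.0):
    """Return editors whose share of undone edits is well above the baseline.

    edits: iterable of (editor, was_reverted) pairs (hypothetical format).
    An editor is flagged when they have at least min_edits edits and their
    reverted share exceeds factor * baseline_rate.
    """
    totals = defaultdict(int)
    reverted = defaultdict(int)
    for editor, was_reverted in edits:
        totals[editor] += 1
        if was_reverted:
            reverted[editor] += 1

    flagged = [
        editor
        for editor, count in totals.items()
        if count >= min_edits and reverted[editor] / count > factor * baseline_rate
    ]
    return sorted(flagged)
```

Everything a flagged editor has done would then be queued for a human look, rather than only the one edit that happened to be noticed.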

    A lot could be built on top of a decent blame function, such as the ability to determine whether two versions of an article differ only in text, and with better exposure statistics for how often an edit has been viewed by someone who ought to know the difference.

    This article is no great bag of chips, but it contains some pertinent key phrases.

    AI comes of age

    This fellow Kroon seems to believe that augmented intelligence is the way of the future. I concur. The game is to best combine what humans do well with what the algorithms do better, combined with an effectiveness metric taking into account power law distributions, minus all the pointless hand-wringing about highly motivated adversaries escaping the cunning traps.

    Profound acts of bad faith are not remotely the same problem. It's unconscionable scope-creep to bring these worries into the petty vandalism conversation. Yes, some fraction of the thwarted petty vandals will escalate into more profound acts of vandalism. Such is life. Problems remain for the future. Many people think we've made no progress on spam. My view is that the spam filters have essentially driven all the amateur spammers out of the system. Once the level of professionalism required to get spam past the spam filters begins to equal the difficulty of doing a real job, then the flow of spam will finally begin to atrophy.

    Another example is ProPolice (or other stack smashing guards) which accomplishes nothing at all on a formal basis, but nevertheless tilts the landscape on exploit cost/benefit, and qualifies your adversaries. One of the heavy burdens on Wikipedia as it now sta

  37. Come on... by GofG · · Score: 1

    Rogue admins abusing their power? An "in" club? If you have a problem with an admin, provide evidence (a diff of the admin abusing his power) here. Follow the case, argue it out, and the admin will be dealt with. Every admin is elected in, guys. If you think Wikipedia is important enough that all the scary "rogue admins" are actually doing harm, go become a part of the election process. Anyone can vote, and your opinion matters regardless of how many edits you have, or how many articles you've worked on. This isn't like America where your vote only matters symbolically. You can stop these evil boogiemen from getting elected, if you want to. Admins aren't "above" the user. They're just the people who hold onto the brooms. It's the users who make the messes, and the users who point the messes out to the janitors. That's how it was back when I was involved in the community, anyway. Oh, cept SlimVirgin. She's a fucking fascist.

    --
    GFA/M/S d-- s: a--- C++++ UBL++$ P+ L+++ !E- W++ N+ !o K- w--- !O !M !V PS++ PE Y+ PGP+ t+++ 5- X+ R tv@ b++ DI++++ D+ G
  38. Re:Think twice before assisting this harmful proje by Jedi+Alec · · Score: 1

    Read the Wikipedia talk page for the Martin Heidegger article and you can see that parts of Wikipedia are infested with Neo-Nazi sympathizers who have the protection of a particular Wikipedia admin.

    Really? Because I actually read through the damn thing, and all I see is a debate about the difference between being a Nazi or being a National Socialist. Add a number of people acting like pompous twats, and you get an edit war, not the coming of the third reich.

    --

    People replying to my sig annoy me. That's why I change it all the time.
  39. Automated vs waiting for a human by tawker · · Score: 1

    As owner of one of the first vandalism reverto bots out there (although pattern speaking, tawkerbot2 didn't do nearly as much as CB) the first take there was if you remove the perceived vandalism almost immediately people don't get any fun out of vandalizing and stop doing it. There was massive opposition at the outset, but then, as volumes increased, people began to freak when the bot was non-operational. Yes, it had false positives which needed to be dealt with, but if I recall correctly, statistically speaking, it was less than a 2% false positive rate - and this was on hundreds of thousands of edits.

  40. Re:Think twice before assisting this harmful proje by jonathansamuel2 · · Score: 1

    Those who opposed any use of the term "Nazi" in the Heidegger article argued that it was pejorative, and that Heidegger was not a Nazi, he was a National Socialist.

    A later commentator said that whether intentional or not, those who posted this drivel were attempting to rehabilitate the Nazis by arguing that they weren't Nazis at all, they were National Socialists. He mentioned Lithuania, where this process is farther along than it is here.