Slashdot Mirror


Algorithm Rates Trustworthiness of Wikipedia Pages

paleshadows writes "Researchers at UCSC developed a tool that measures the trustworthiness of each Wikipedia page. Roughly speaking, the algorithm analyzes the entire 7-year user-editing-history and utilizes the longevity of the content to learn which contributors are the most reliable: If your contribution lasts, you gain 'reputation,' whereas if it's edited out, your reputation falls. The trustworthiness of a newly inserted text is a function of the reputation of all its authors, a heuristic that turned out to be successful in identifying poor content. The interested reader can take a look at this demonstration (random page with white/orange background marking trusted/untrusted text, respectively; note "random page" link at the left for more demo pages), this presentation (pdf), and this paper (pdf)."
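
As a rough illustration of the idea in the summary (not the researchers' actual algorithm, whose update rule is given only in the linked paper), here is a minimal Python sketch. The names `GAIN`, `LOSS`, `update_reputation` and `text_trust` are hypothetical, and averaging the authors' reputations is an assumed choice for the "function of the reputation of all its authors".

```python
from collections import defaultdict

# Hypothetical constants: the summary does not say how much reputation
# is gained or lost per contribution.
GAIN = 1.0
LOSS = 1.0


def update_reputation(reputation, history):
    """Apply a list of (author, survived) outcomes to a reputation table.

    survived is True if the contribution lasted and False if it was
    edited out. Reputation is clamped at zero (an assumption).
    """
    for author, survived in history:
        delta = GAIN if survived else -LOSS
        reputation[author] = max(reputation[author] + delta, 0.0)
    return reputation


def text_trust(reputation, authors):
    """Trust of a passage, taken here as the mean reputation of its authors."""
    if not authors:
        return 0.0
    return sum(reputation[a] for a in authors) / len(authors)


reputation = defaultdict(float)
update_reputation(reputation, [("alice", True), ("alice", True), ("bob", False)])
print(text_trust(reputation, ["alice", "bob"]))
```

A demo like the one linked above would then presumably map low trust values to an orange background and high ones to white; that mapping is not sketched here.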

175 comments

  1. Light Bulb Moment by dsginter · · Score: 5, Funny

    Someone should make a wikipedia entry for this algorithm to see how trustworthy it is.

    --
    More
    1. Re:Light Bulb Moment by marcello_dl · · Score: 5, Interesting

      Sounds crappy. Let's say you expose some important misdeed. You're likely to be edited out by an army of paid staff who keeps an eye on the 'net. (don't tell me I'm paranoid because i saw it happening and read about stuff like that in the news, even slashdot). You are not contributing much else to wikipedia because you simply wanted to expose what's in your knowledge, so you'll end up with a low karma.

      Anyway, i guess it'll be another pagerank or slashdot filter affair. People trying to beat it, devs trying to make it better.

      The plus is, there is not only wikipedia. You can always search the rest of the web.
      The minus is, you search the rest of the web with google which is equivalent if not worse.

      We need a good search engine on top of a tor network, and bandwidth to make it run smooth. Not many other way to achieve real net freedom.

      --
      ---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
    2. Re:Light Bulb Moment by MrNaz · · Score: 2, Insightful

      "We need a good search engine on top of a tor network, and bandwidth to make it run smooth. Not many other way to achieve real net freedom."
      Can you explain yourself a little more? I don't see how Tor would improve the quality of information being searched for. (Not arguing, just interested in your ideas)

      --
      I hate printers.
    3. Re:Light Bulb Moment by Anonymous Coward · · Score: 0

      Except, you're not a neutral point of view. Neither are your editors, but the point stands for both. Wikipedia isn't some rumor mill, or hate pit, and it really should try to report those things to its visitors. If you have a beef with a company or government or individual, Wikipedia doesn't want you to use them as a message board to attack others, even if you're right and logical.

      Wikipedia strives to be, at best, a sort of comprehensive, neutral, and objective facts for people to read about and be informed.

    4. Re:Light Bulb Moment by Anonymous Coward · · Score: 0

      Someone should make a wikipedia entry for this algorithm to see how trustworthy it is.
      Someone should make an algorithm for checking this algorithm to see how trustworthy it is.
    5. Re:Light Bulb Moment by KaiserXavier · · Score: 1

      and then an algorithm for the wikipedia entry whose purpose is to check its trustworthiness? can we call it "Strange Loop"? :P

    6. Re:Light Bulb Moment by eh2o · · Score: 1

      This is just one in a long series of "quality" ranking algorithms based on analysis of information networks. The next logical step is a meta-algorithm that ranks the quality of quality ranking algorithms. Metrics will include vulnerability to attack / manipulation, bias or skewness of results, logical consistency of quality estimation, logical / semantic consistency of content judged to be trusted or coherent, and reaction time to incorporation of new information. It's only a matter of time before some poor PhD student spends their lonely years coding it.

    7. Re:Light Bulb Moment by PingPongBoy · · Score: 3, Insightful

      Sounds crappy. Let's say you expose some important misdeed. You're likely to be edited out by an army of paid staff who keeps an eye on the 'net

      Nope. If you post one misdeed and that gets edited out, such is life but shouldn't affect your credibility that much because everyone is always getting edited out a few times in the long run.

      However, if you edit hundreds or thousands of different articles and people leave you alone, o great guru, you're good.

      Wikipedia's ultimate strength depends on the community's desire for good information, readiness to stomp on crap, and will to contribute. Conversely, Wikipedia would decay if people didn't give a rat's ass about Wikipedia and let it go to ruin like an unweeded garden. This mechanism of quality control needs to be applied down the hierarchy of categories, subcategories, and articles. It's understandable that certain areas will have more pristine content overall while other areas will be populated with childish and wanton ideas. Thus, a contributor evaluation program can be tested.

      --
      Know your pads. One time pad: good for cryptography. Two timing pad: where to take your mistress.
    8. Re:Light Bulb Moment by Enlightenment · · Score: 1

      Sounds good. While he's doing that, I'll be off writing my meta-meta-algorithm.

    9. Re:Light Bulb Moment by Anonymous Coward · · Score: 0

      > The plus is, there is not only wikipedia. You can always search the rest of the web. [citation needed]
      > The minus is, you search the rest of the web with google which is equivalent if not worse. [weasel words]

      There, edited and fixed it. Let's see for how long it stays edited.

    10. Re:Light Bulb Moment by marcello_dl · · Score: 1

      It's not about the quality of what is searched for (unless we get a little paranoid and consider the case where different content is returned by a site depending on the geolocation of the request IP).

      It's about the mere act of searching being reported as suspect activity. Or about avoiding profiling. Needless to say the potential for abuse of this freedom is huge.

      --
      ---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
    11. Re:Light Bulb Moment by marcello_dl · · Score: 1

      > Nope. If you post one misdeed and that gets edited out, such is life but shouldn't affect your credibility that much because everyone is always getting edited out a few times in the long run.

      That makes sense. But if you're one of the bad guys then all you need is a big provider with lots of ip ranges, clean up of previous cookies and altering the user agent, and you have a similar weighted counterattack for the guy who exposes a misdeed.

      --
      ---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
  2. Seems a bit dangerous by fymidos · · Score: 4, Insightful

    >If your contribution lasts, you gain 'reputation,' whereas if it's edited out, your reputation fails

    And the editor wars start ...

    --
    Washington bullets will simply be known as the "Bulle
    1. Re:Seems a bit dangerous by Anonymous Coward · · Score: 0

      Yeah. Wikipedia is political enough without adding reputations into the cauldron.

    2. Re:Seems a bit dangerous by N!k0N · · Score: 4, Insightful

      Yeah, that is a bit of a "dangerous" way to go about rating the content, however I think it could be a step in the right direction. If this can be improved, perhaps the site will gain a better reputation in the eyes of professors. Now, I don't doubt that there is a lot of misinformation on the site (intentional or otherwise); however, a good deal of the information I have used for research papers or to quickly check something seems to be confirmed elsewhere (texts, journals, etc).

    3. Re:Seems a bit dangerous by ajs · · Score: 2, Informative

      Editor wars are an old thing. The real concern I'd have would be how you deal with old editors who don't contribute anymore (but were "trustworthy" when they did) vs. new editors. Overall, I think it's a good idea, and I would go so far as to say that MediaWiki should offer a feature that performs this analysis for you.

      -~~~~

    4. Re:Seems a bit dangerous by fymidos · · Score: 2, Insightful

      > Editor wars are an old thing

      but they get a whole new meaning when it makes sense to find all edits by an editor, delete them, and then rewrite them as your own...

      --
      Washington bullets will simply be known as the "Bulle
    5. Re:Seems a bit dangerous by mdwh2 · · Score: 1

      One problem is that having your edit edited out at some point in the future doesn't mean there was anything bad about your edit. Sometimes even though your edit was a clear improvement, an entire paragraph might be removed because it's deemed to be unsuitable/unnecessary, or someone rewrites the entire paragraph because they think they can make it better still.

      On the other hand, if it specifically only penalises "undone" edits (which it seems to refer to in the article), i.e., your change and only your change is reverted, this would avoid the problem.

      I wonder if it includes reverting vandalism as being an edit attributed to you? (Otherwise, you could vandalise an established piece of text under a sockpuppet/anonymous account, then revert it, gaining the credit for the text ... hopefully they've taken such things into account.)

    6. Re:Seems a bit dangerous by space_in_your_face · · Score: 1

      Look at page 22 of the presentation. They keep track of authorship of a text. So if someone does what you are describing, it has no positive effect on his reputation.

    7. Re:Seems a bit dangerous by NickCatal · · Score: 1

      If you read it it does say that reverting vandalism will improve your reliability.

      Only problem is, if I continually revert vandalism, am I not also inflating my own rep when I decide to go make an edit that turns out to be incorrect?

      It is not only a really good idea, it is a GREAT idea, but as a Wikipedia editor who has introduced some incorrect facts (and changed them back later or had them changed for me, thankfully) into the site I am a little worried about how much trust it gives each user. I have a lot of edits, most of which are still on the site, but that does not mean that my information is correct.

      Not to mention Wikipedia already has a server-crunch, this would take an insane amount of CPU to do in real-time, which the Wikimedia Foundation just cannot afford.

      --
      -nick
    8. Re:Seems a bit dangerous by fymidos · · Score: 1

      They keep the authorship of all text, but if i delete somebody's text, copy it, and submit it as mine, the authorship changes. So, yes, it does affect his reputation because the text that belongs to him will be deleted.

      --
      Washington bullets will simply be known as the "Bulle
    9. Re:Seems a bit dangerous by xappax · · Score: 2, Insightful

      If this can be improved, perhaps the site will gain a better reputation in the eyes of professors.

      No, it won't gain a better reputation in the eyes of professors (at least decent professors) for two reasons:

      1) It's an inherently flawed algorithm and easily gameable. It's useful as a very vague unreliable data-point, and not much else.

      2) Wikipedia is not a source for academic research, and never will be. If it's anything to academics, it's a place to go to get some clues on how to proceed with their real research - for example finding links to reliable sources, or related terms and concepts. It's like Google: a great tool for research, not a source.

      Wikipedia is not and has never claimed to be an authoritative source on anything, and until people stop referring to it as though it is (or could be, or claims to be) - we'll never get over this wanking about "Don't trust wikipedia, it's not reliable - anyone can change it, omg!"

    10. Re:Seems a bit dangerous by Anonymous Coward · · Score: 0

      It is dealt with in the presentation. Gonna look into the paper...

      In short: if your revert is also undone, you lose reputation; the one whose edits you cancelled does not.

    11. Re:Seems a bit dangerous by ajs · · Score: 2, Informative

      You're not actually reading the text that they linked to, are you?

      We're not talking about Wikipedia's concept of authorship, here, but the tool's. The tool tracks who first wrote something and doesn't re-assign authorship because it was removed (e.g. by a vandal) and then restored.

      You would have to remove what they wrote and then restore it in your own words in such a way that your edit was good enough to be retained by the community. In which case, the system worked.

      Overall, I think it would be an excellent thing.
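
      The authorship rule described above (credit goes to whoever first wrote a piece of text, even if it is removed and later restored) could be sketched roughly as follows. This is an illustration only, not the tool's actual method: the function name `attribute_authors` is hypothetical, and matching whole sentences exactly is a simplifying assumption.

```python
def attribute_authors(revisions):
    """Map each sentence of the latest revision to the editor who first wrote it.

    revisions is a chronological list of (editor, sentences) pairs, where
    sentences is the full list of sentences on the page after that edit.
    """
    if not revisions:
        return {}

    first_author = {}
    for editor, sentences in revisions:
        for sentence in sentences:
            # setdefault keeps the earliest editor, so text that is deleted
            # and later restored is still credited to its original author.
            first_author.setdefault(sentence, editor)

    _, latest_sentences = revisions[-1]
    return {sentence: first_author[sentence] for sentence in latest_sentences}


history = [
    ("alice", ["Pi is a mathematical constant."]),
    ("vandal", []),  # page blanked
    ("bob", ["Pi is a mathematical constant.", "It is irrational."]),  # restore plus new text
]
print(attribute_authors(history))
```

      Under this exact-match assumption, bob gets credit only for the sentence he added; a genuine rewrite in different words would count as new text by the rewriter, which is the case discussed in the comment above.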

    12. Re:Seems a bit dangerous by Bongo+Bill · · Score: 1

      And the editor wars start ... You misspelled "continue."
      --
      ...but is it art?
    13. Re:Seems a bit dangerous by l0b0 · · Score: 1

      Sounds to me like this should really be made into a GreaseMonkey script or Firefox extension, to avoid having an "official" algorithm that everybody will try to appease.

  3. Hmmm... A reputation metric... by Colin+Smith · · Score: 3, Funny

    It'd be nice if it could be generalised to other sites...

    --
    Deleted
    1. Re:Hmmm... A reputation metric... by PJ1216 · · Score: 1

      I doubt it could be. The metric itself would be inherently different on various pages. The metric here is longer lasting implies a trustworthy source. On other sites, this metric may be worthless or the exact opposite (sites that constantly update or change, etc.) This metric is tailored to a wiki, so maybe other wiki-sites, but not other sites in general.

    2. Re:Hmmm... A reputation metric... by zeromorph · · Score: 1

      Ssssh. I know something better: reputation_algorithm 2.0, just let people do it, call "reputation" "karma" (just for the geek factor) and I predict it will be a great success at least in the stranger corners of the internet.

      --
      "Hannibal's plans never work right. They just work." Amy/A-Team
    3. Re:Hmmm... A reputation metric... by aadvancedGIR · · Score: 0

      I think he was joking about the long-established /. karma system, Mr 7-digit ID.

    4. Re:Hmmm... A reputation metric... by uigin · · Score: 1

      It basically is... in the form of the Google pagerank algorithm (instead of editors they use linkers).

      Dave.

    5. Re:Hmmm... A reputation metric... by Anonymous+Brave+Guy · · Score: 1

      I wonder whether nominating an editor on Wikipedia a "karma whore" will result in a net increase or decrease of "reputation" for the nominee. :-)

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    6. Re:Hmmm... A reputation metric... by shani · · Score: 1

      I think he was joking about the long-established /. karma system, Mr 7-digit ID.

      Don't be too hard on him. The humor of us old-timers is often missed on you kids today.

    7. Re:Hmmm... A reputation metric... by heinousjay · · Score: 1

      You forgot to put humor in quotes.

      Or perhaps I should say:

      In Soviet Russia, humor quotes you.

      Yeah, that gets my point across a whole hell of a lot better.

      --
      Slashdot - where whining about luck is the new way to make the world you want.
    8. Re:Hmmm... A reputation metric... by BrettJB · · Score: 1

      Actually, I believe the correct Smirnoff reversal of your statement would be:

      In Soviet Russia, quotes forgot to put humor in you.

      Damn, did I actually just stoop to the "In Soviet Russia" level? Hot grits and Beowulf clusters, here I come!

      --
      Smell that? You smell that? Burning karma, son. Nothing in the world smells like that...
  4. Godwin's Second Law by Anonymous Coward · · Score: 3, Insightful

    Every paper touting automatic adjustments for gaming the system becomes obsolete the moment it is published.

    (Godwin didn't publish this, but I might get around to editing his Wikipedia entry to say that he did).

    1. Re:Godwin's Second Law by Skippyboy · · Score: 1

      And just to prove his First law - Shut up you freakin' NAZI!!! :-)

  5. 7 years??? by Anonymous Coward · · Score: 3, Interesting

    I've been noticing some of the edit histories for articles that are 5 years old on Wikipedia stop well before 5 years ago. Have some of the edit histories been lost or deliberately truncated?

    1. Re:7 years??? by Anonymous Coward · · Score: 2, Informative

      Some edit histories are also completely messed up in random order. Look at the weird edit history for Wikipedia's article on Pi "Revision as of 21:54, 8 September 2002" precedes older revision "Revision as of 06:17, 5 December 2001"
        How can we trust the Wikimedia software if it corrupts the edit database?

    2. Re:7 years??? by Stooshie · · Score: 1, Informative

      RTFA!

      • The demo is based on the Wikipedia dump dated February 6, 2007. The demo contains pages that are contiguous in the dump; pages were not selected manually or individually. The demo contains the last 50 revisions of each page (or fewer, for pages with fewer revisions).
      • Occasionally, the coloring breaks the Wikimedia interpretation of the markup language. We are trying to resolve all such issues by locating the coloring information appropriately.
      • The algorithms are still very preliminary.
      • No, you cannot edit the pages. :-)
      --
      America, Home of the Brave. ... .and the Squaw.
    3. Re:7 years??? by Goaway · · Score: 1

      Perhaps you should, instead, try to read and understand the post that you are replying to.

    4. Re:7 years??? by Veinor · · Score: 1
      There are two ways to remove an edit from an article's edit history. One is for an admin to delete the article and restore all the revisions but the ones he wants gone. But other admins can still delete it. There is also oversight, which removes it from all users except others with oversight, and is only used in rare cases. From the oversight page:

      This feature is approved for use in three cases.
      1. Removal of nonpublic personal information such as phone numbers, home addresses, workplaces or identities of pseudonymous or anonymous individuals who have not made their identity public.
      2. Removal of potentially libellous information either: a) on the advice of Wikimedia Foundation counsel or b) when the subject has specifically asked for the information to be expunged from the history, the case is clear, and there is no editorial reason to keep the revision.
      3. Removal of copyright infringement on the advice of Wikimedia Foundation counsel.


      Oversight is extremely rare; only 28 users have it. Of course, maybe the article just hasn't been edited since then.

  6. If by Joseph1337 · · Score: 0

    you do something wrong, it will mod you "-1, Troll"

    1. Re:If by Tribbin · · Score: 2, Funny

      Whereas the implementation of "+1 funny" will be the end of the information age.

      --
      If you mod this up, your slashdot background will turn into a beautiful sunset!
  7. Doesn't take into account common myths by Cryophallion · · Score: 5, Interesting

    So, if there is a myth that a lot of people believe is true, then it will stay up there as it is not challenged. So, it still gets reputation, and therefore more credibility, making it more likely that the myth will be perpetrated.

    Also, if someone hasn't noticed something that is wrong on an esoteric entry, it will also be given credibility, and once again be more likely to be considered to be fact.

    While you could add voting to the algorithm to have people vote on whether it is true, that still gets destroyed by someone who just votes because they think it's true, not because they have verified it.

    Either way, it potentially gives additional credibility to something that may be very wrong.

    1. Re:Doesn't take into account common myths by SQLGuru · · Score: 1

      Another way to increase your standing is to invent pages of content no one would ever go to (Xpi - a specific hovercraft model) or to just make small grammatical shifts so that your updated content ages while someone more reliable loses credibility.

      Layne

    2. Re:Doesn't take into account common myths by Loke+the+Dog · · Score: 1

      Well, the core of the issue you're presenting is that people will assume that the facts are correct if the article has a good rating, and you are right that most people will. But I have a feeling that the articles with high rating will also be very carefully reviewed by dedicated wikipedians, just like featured articles are. Wikipedians with a high credibility rating will not be the ones who remove sections that refer to good research, so all it takes is for one wikipedian to find good research, write correct information based on it, and then refer to it.

    3. Re:Doesn't take into account common myths by hsqueak · · Score: 1

      Surely it's not the entries themselves that directly gain credibility, but the parts written by authors who have previously shown themselves to be accurate and trustworthy (i.e. their edits generally remain for longer)? Wikipedia encourages references, so all it could take is one person making an edit and referencing an expert, reliable source showing the correct non-myth to change the angle of the article.

    4. Re:Doesn't take into account common myths by mdwh2 · · Score: 1

      So, if there is a myth that a lot of people believe is true, then it will stay up there as it is not challenged.

      You mean if it is not challenged. Wikipedia is quite clear on this: The threshold for inclusion in Wikipedia is verifiability, not truth. Certainly I and other editors question, and even eventually remove material even if it may seem to be true, because there is no evidence for it and no way to verify it. Now yes, there is the problem that not all editors follow this, and there is the general problem with this algorithm that some articles have dodgy information that goes unnoticed a long time, thus they'd be marked as trustworthy. But the solution is not to have people voting on the whether it is true (that would be a horrendous idea) - the answer is to remove unverifiable material.

    5. Re:Doesn't take into account common myths by alx5000 · · Score: 1

      Also, if someone hasn't noticed something that is wrong on an esoteric entry
      Cryophallion, a wiki entry is not food.
      --
      My 0.02 cents
    6. Re:Doesn't take into account common myths by Cryophallion · · Score: 0

      Are you saying that esoteric is a food term?

      See definition #2. I meant it as a thing that few know about or understand, so there may be few people who can properly cite verifiable materials.

      And it IS food for the brain...

      esoteric (ĕs′ə-tĕr′ĭk)
      adj.

            1.
                        1. Intended for or understood by only a particular group: an esoteric cult. See Synonyms at mysterious.
                        2. Of or relating to that which is known by a restricted number of people.
            2.
                        1. Confined to a small group: esoteric interests.
                        2. Not publicly disclosed; confidential.

    7. Re:Doesn't take into account common myths by alx5000 · · Score: 1

      Man, I know. I'm sorry if you missed the joke.

      --
      My 0.02 cents
    8. Re:Doesn't take into account common myths by Cryophallion · · Score: 0

      Great, I miss one episode of family guy and a joke goes right over my head.

      Dang it. Now I lost all my geek cred.

    9. Re:Doesn't take into account common myths by welcher · · Score: 1

      Just to make sure a strange word usage is not perpetuated, I thought I'd point out that myths are not "perpetrated".

    10. Re:Doesn't take into account common myths by mcrbids · · Score: 1

      So, if there is a myth that a lot of people believe is true, then it will stay up there as it is not challenged. So, it still gets reputation, and therefore more credibility, making it more likely that the myth will be perpetrated.

      Yep. There are lots of these. Snopes is full of these - "everybody knows it's true" but yet it's false.

      Also, if someone hasn't noticed something that is wrong on an esoteric entry, it will also be given credibility, and once again be more likely to be considered to be fact.

      Oh, you were talking about Wikipedia - but there too, in real life, we have the same dynamic at work. Such as, for example, the "sovereign" status of Sealand. If you look at the legal history of this claim, it's simply a case of "no court has bothered to rule against them". So it stands as a claim.

      While you could add voting to the algorithm to have people vote on whether it is true, that still gets destroyed by someone who just votes because they think it's true, not because they have verified it.

      Either way, it potentially gives additional credibility to something that may be very wrong.


      If a bunch of people vote on it for whatever reason they prefer, then that vote stands. (at least in theory, the last two US elections are a good example of how this can go wrong) There will ALWAYS be a fight to preserve truth over opinion. And one of the best examples of this is Slashdot!

      How many posts modded funny do you see about the "Blue Screen of Death"? I think I've actually seen a BSOD maybe once in the last three years - XP is a far cry from Win98. How many highly ranked posters here haven't the foggiest clue how copyrights, patents, and trademarks work, or even are aware that there's a difference between them?

      How many people here actually believe that if any GPL code here is found in the Windows Source code, then all of Windows must then be open-sourced, and that they would then have a legal right to demand this? (I'm probably going to get modded to -1 troll for this last question)

      See what I mean?

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    11. Re:Doesn't take into account common myths by Anonymous Coward · · Score: 0

      Yep, common myths such as "Common Descent" and "Global warming".

  8. Seems to work ... by Purity+Of+Essence · · Score: 5, Funny

    Seems to work, the entire page turned orange.

    --
    +0 Meh
    1. Re:Seems to work ... by Alsee · · Score: 1

      What does blue mean?

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  9. Warning! Warning! by Anonymous Coward · · Score: 0

    binutils-2.18.tar.{gz|bz2} is ya GPLv3!

  10. hmmm... by PJ1216 · · Score: 5, Funny

    They should just call it wiki-karma.

    1. Re:hmmm... by Magada · · Score: 1

      Ac'lly, they should call it wiki-staleness. Pages/sections which are not edited for AGES should be marked in a sickly green and flagged for editing, as the information is likely to have been obsoleted in some way (yes, even historical information).

      --
      Something bad is coming when people are suddenly anxious to tell the truth.
  11. #REDIRECT by Chris+Pimlott · · Score: 4, Insightful

    It appears they include #REDIRECT pages; the very first page the random link took me to was Cheliceriformes, with the #REDIRECT line in orange. Seems an easy way to gain trust, once a redirect is created it is hardly ever changed.

    1. Re:#REDIRECT by Chris+Pimlott · · Score: 1

      Whoops, I misread the summary; I thought orange was trusted, so maybe they have special consideration for redirects. Or maybe that one redirect is a fluke; I can't tell now that the /.ing has begun.

    2. Re:#REDIRECT by UnHolier+than+ever · · Score: 1

      That's only if the redirect points to the correct page. If it is vandalism, or if it points to an article with little relevance to the term searched for, the redirect will be removed, hence losing reputation.

  12. I dunno about this system. by Wilson_6500 · · Score: 5, Insightful

    Does it take into account magnitude of error corrections? If major portions of someone's articles are being rewritten, that's a good reason to de-rep them. If someone makes a bunch of minor spelling or trivial errors, then that's not necessarily a reason to do so.

    And, of course, there is the potential for abuse. If the software could intelligently track reversions and somehow ascribe to those events a neutral sort of rep, that would probably help the system out.

    As it stands, they're essentially trying to objectively judge "correctness" of facts without knowing the actual facts to check. That's somewhat like polling a college class for answers and assigning grades based on how many other people DON'T say that they disagree with a certain person in any way.

    1. Re:I dunno about this system. by IBBoard · · Score: 1

      I was thinking the same thing. Surely it would also penalise you if you didn't take a sufficiently approved writing style and it was re-written, including the same facts but different words, or if the article was restructured and that caused some rewording.

      I guess it's a start, but pure longevity of content isn't the best metric for trustworthiness.

    2. Re:I dunno about this system. by ajs · · Score: 1

      I would expect it to consider the percentage of original text that remains. For example, if you write 2 paragraphs and over time 20 words are edited by others, that's a pretty decent rate.
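
      One way to estimate "the percentage of original text that remains" is a word-level diff. The sketch below uses Python's standard difflib; the function name `surviving_fraction` is hypothetical, and nothing in the thread confirms that the tool measures survival this way.

```python
import difflib


def surviving_fraction(original, current):
    """Fraction of the original words still present, in order, in the current text."""
    original_words = original.split()
    current_words = current.split()
    if not original_words:
        return 1.0  # nothing was contributed, so nothing was lost

    matcher = difflib.SequenceMatcher(a=original_words, b=current_words, autojunk=False)
    # Sum the sizes of all matching runs of words (the final dummy block has size 0).
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(original_words)


print(surviving_fraction("the quick brown fox", "the quick red fox"))
```

      A score near 1.0 would mean the contribution mostly survived with light copy-editing, which is the case that arguably should not cost the author reputation.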

    3. Re:I dunno about this system. by shird · · Score: 1

      Although vandals and repeat offenders etc often make small edits, such as inserting a 'not' or repeatedly inserting a url or something. Someone who makes a large edit has gone to a lot of effort, and this is probably more trustworthy than someone who makes a quick hack. People who insert intentionally misleading information wouldn't go to too much effort to just have their edits reverted or easily noticed, so would just make small edits.

      So I don't think someone who goes to a lot of effort to insert a large amount of text (read: facts) that gets proofread and tidied up or rewritten by others should be penalised.

      --
      I.O.U One Sig.
    4. Re:I dunno about this system. by ceoyoyo · · Score: 1

      They're trying to assess trustworthiness, not correctness. They're different.

      They've probably fallen slightly short of trustworthiness, but they've hit truthiness straight on.

  13. I suspect this heuristic measures.... by Anonymous Coward · · Score: 5, Insightful

    the relative controversy of the item being edited.

    If I edit a history page of a small rural village near where I live, I can guarantee that it will remain unaltered. None of the five people who have any knowledge or interest in this subject have a computer.

    If I edit an item on Microsoft's attitude to standards, or the US occupation of Iraq, I'm going to be flamed the minute the page is saved, unless I say something so banal that no one can find anything interesting in it.

    But my Microsoft page might be accurate, and my village history a tissue of lies....

    1. Re:I suspect this heuristic measures.... by zoney_ie · · Score: 1

      A small rural village? Unless you've put a reasonably decent bit of well-presented detail up about it, there's every possibility it'll be deleted as unverified, non-notable, etc., etc.

      --
      -- *~()____) This message will self-destruct in 5 seconds...
  14. Tuned for Subject Matter by erroneous · · Score: 5, Insightful

    Sounds like a worthy start to the process of introducing more trustworthiness into Wikipedia entries, but this maybe needs tuning for content type too.

    After all, just because someone is a reliable expert at editing the Wikipedia entries on Professional Wrestling or Superheroes doesn't necessarily mean we should trust their edits on, for instance, the sensitive issues of Tibetan sovereignty.

    --
    erroneous: look me up in a dictionary
    1. Re:Tuned for Subject Matter by xappax · · Score: 1

      Sounds like a worthy start to the process of introducing more trustworthiness into Wikipedia entries

      It's not, and the reason is that any attempt to introduce more "trustworthiness" into Wikipedia is a waste of time. People distrust Wikipedia because of its most basic, core concept: anyone can contribute. In order to get these people to trust Wikipedia, you'd have to eliminate that core concept. A trust system like this algorithm will just prompt "nay-sayers" to point out how it's not reliable either - and it's not. The more people try to shoehorn trust and reliability into the Wikipedia model, the more criticism it creates because people think Wikipedia is making a claim to reliability that is obviously false. It seems like someone's trying to trick them, and nobody likes that.

      In a nutshell, Wikipedia isn't reliable, it's not meant to be, it's not going to be. Get over it and appreciate it for what it is, an often useful community-maintained reference.

    2. Re:Tuned for Subject Matter by ASBands · · Score: 1

      While you're absolutely correct, you must also factor in how people generally behave. How often is a reliable expert on Professional Wrestling going to edit the issue of Tibetan sovereignty? Sure, somebody could, but the goal isn't to get a black and white objective analysis of right and wrong, it's to get a gray area subjective rating of "trustworthiness." Better yet, it appears to work - in my article on the politics of Djibouti, all the "facts" were highlighted.

      --
      My UID is a prime number. Yeah, I planned that.
    3. Re:Tuned for Subject Matter by rm999 · · Score: 1

      That's the whole point - he isn't trustworthy if he is going around editing things he doesn't know about. Theoretically, his edit to Tibetan sovereignty will be removed if he adds something untrue, which will hurt his trustworthiness. Additionally, if he wants his trustworthiness to remain high, he won't be editing too many things he doesn't know about. This trustworthiness number attached to him will pressure him to edit only things he knows about, which is a win for the site.

      This metric reminds me a little of Google's pagerank. At the surface, linking to other sites to "vote" for them has a lot of problems - it's easy to list several. But when you look at it at a macro scale, a lot of these problems smooth out and it ends up working better than any other algorithms.

  15. algorithmic argumentum ad verecundiam by bareman · · Score: 1

    It's practically automatic with people, so codifying it for machines should be no surprise.

    http://en.wikipedia.org/wiki/Appeal_to_authority

    1. Re:algorithmic argumentum ad verecundiam by iago-vL · · Score: 1

      Sorry, but according to the algorithm here, that page isn't trustworthy.

  16. Unpopular but neutral points of view? by Knuckles · · Score: 5, Interesting

    I realize that an encyclopedia by definition will always emphasize the established majority opinion about any given subject. But it seems that this tool might strengthen majority opinions beyond what is reasonable. If you happen to edit an article by adding valid but unpopular dissenting points of view, and the other contributors are sufficiently boneheaded, you lose karma (or whatever the tool calls it) for no good reason. This might then easily develop a life of its own, and you are screwed.

    --
    "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
    1. Re:Unpopular but neutral points of view? by allthingscode · · Score: 1

      All wikipedia entries should have some reference to source material to be considered valid. This should be true before or after this tool.

      This sounds to me like the wikipedia version of Google's link count algorithm. This got me to thinking though: wikipedia is old enough that other articles are referencing it. Why not use the link counts to quantify the credibility of a page as well as this?

    2. Re:Unpopular but neutral points of view? by Knuckles · · Score: 1

      Both are good points, I think. But while articles indeed should have links to sources, by far not all do. And by far not all information that isn't backed up by source links right now is worthless or wrong.

      --
      "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
    3. Re:Unpopular but neutral points of view? by ajs · · Score: 1

      Generalize that to controversy of any form. I have spent some time editing articles that focus on bigotry, genitalia and other topics which get a lot of vandalism... I wonder how that would be dealt with....

    4. Re:Unpopular but neutral points of view? by oni · · Score: 1

      All wikipedia entries should have some reference to source material to be considered valid.

      That's true, but that's not what this tool is looking at. I might go edit the conservapedia, adding valid, multiply-sourced facts - but then immediately get a revert. And this would happen again and again.

      Now this tool comes along and says, "ah ha! everything this guy writes gets reverted. He's obviously not trustworthy." And now my supposed untrustworthiness is used as an excuse to remove everything else I contribute.

      This could be a tool for group-think (more so than Wikipedia already is).

    5. Re:Unpopular but neutral points of view? by tepples · · Score: 1

      And by far not all information that isn't backed up by source links right now is worthless or wrong.

      One of Wikipedia's core policies is that if you can't find a reliable source for an assertion, you should probably delete the assertion from the article. Wikipedia wants verifiability, not truth.
    6. Re:Unpopular but neutral points of view? by Knuckles · · Score: 1

      I know, we still have to deal with reality though. Saying "this is not a problem because all statements should have sources" is not helpful, when a huge number of statements don't.

      --
      "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
    7. Re:Unpopular but neutral points of view? by Anonymous Coward · · Score: 0

      And what if you're constantly being censored? For example, what if I edit Oprah's entry with factual information that some Oprah fan doesn't like and so changes the content that I've entered, decreasing my 'reliability' points. Not that I'd ever say anything bad about Oprah.

    8. Re:Unpopular but neutral points of view? by Knuckles · · Score: 1

      Thanks, you expressed my point much more clearly than I could.

      --
      "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
    9. Re:Unpopular but neutral points of view? by gplus · · Score: 2, Funny

      And by far not all information that isn't backed up by source links right now is worthless or wrong.

      I want that sentence taken outside and shot. :)
    10. Re:Unpopular but neutral points of view? by Knuckles · · Score: 1

      Funny. But you know what I mean: often you come across a statement that you know is true, but is tagged to need a citation. Now, I agree that such statements are not particularly great in an encyclopedia, but the point still stands that the lack of a link does not per se make the statement untrue.

      This is made more difficult in case of topics that are (a) not well documented in general, and (b) happened before the internet became mainstream. Anyone who has ever searched for little-known, pre-internet stuff that did not originate in the US, or, even worse, only existed in a little country somewhere, knows how frustrating this can be. Often not even printed material exists.

      Should all this information be simply eradicated from history?

      --
      "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
    11. Re:Unpopular but neutral points of view? by alx5000 · · Score: 1

      I realize that a community blog by definition will always emphasize the established majority opinion about any given subject. But it seems that this tool might strengthen majority opinions beyond what is reasonable. If you happen to post a comment by adding valid but unpopular dissenting points of view, and the other contributors are sufficiently boneheaded, you lose karma (or whatever the tool calls it) for no good reason. This might then easily develop a life of its own, and you are screwed.
      --
      My 0.02 cents
    12. Re:Unpopular but neutral points of view? by Knuckles · · Score: 1

      Funny, but I don't see that in reality. I read /. at threshold -1 to watch moderation misuse, and despite the frequent bitching about groupthink, it is extremely rare that anyone is erroneously modded to even 0, much less -1.

      --
      "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
    13. Re:Unpopular but neutral points of view? by SL+Baur · · Score: 1

      Yeah, that's a good point. Sometimes a valuable reference disappears too.

      One time when I was really bored, I went through the Wikipedia checking all the entries I could find on baseball players who had played professional baseball in Japan. In general they were pretty bad, often times not even mentioning specific teams let alone any statistics.

      There used to be a really excellent site that had rosters and statistics going back at least into the 90's. It was even translated into English and unfortunately, it no longer exists.

  17. Good content bad editors by Anonymous Coward · · Score: 0

    What if good truthful content is being removed by BAD (malicious) editors? How does the algorithm account for that?

  18. the whole internet needs something like this by Anonymous Coward · · Score: 0

    We fairly urgently need to assess the reliability of the various sources on the net because the s/n ratio is getting so low that the net becomes harder and harder to use for real work.

    more addictive than crack, do not click :)

  19. Tyranny of the majority by G4from128k · · Score: 5, Insightful

    Although this method will certainly help filter pranks and cranks, it won't help if the "consensus" among wikipedia authors is wrong. If a true expert edits a page, but the masses don't agree with the edit, they will undo the expert's addition and give the expert a low reputation. Thus, the trust rating becomes a tool for maintaining erroneous, but popular ideas.

    That said, I can't help but believe that this tool is a net positive because it makes points of debate more visible. One could even argue that it literally highlights the frontiers of human knowledge. That is, high-trust (white) text is well known material and highlighted (orange) text represents contentious or uncertain conclusions.

    --
    Two wrongs don't make a right, but three lefts do.
    1. Re:Tyranny of the majority by Anonymous Coward · · Score: 0
      Sounds like Slashdot.

      :%s/wikipedia authors/slashdot moderators/g
      :%s/edits a page/posts a comment/g
      :%s/the edit/the comment/g
      :%s/undo/mod down/g
      :%s/(white)/(+5)/g
      :%s/(orange)/(-1)/g

      Although this method will certainly help filter pranks and cranks, it won't help if the "consensus" among slashdot moderators is wrong. If a true expert posts a comment, but the masses don't agree with the comment, they will mod down the expert's addition and give the expert a low reputation. Thus, the trust rating becomes a tool for maintaining erroneous, but popular ideas.

      That said, I can't help but believe that this tool is a net positive because it makes points of debate more visible. One could even argue that it literally highlights the frontiers of human knowledge. That is, high-trust (+5) text is well known material and highlighted (-1) text represents contentious or uncertain conclusions.

      If anything, this at least exposes the hideous inappropriateness of phrases like "frontiers of human knowledge" when talking about a goddamn website (spoiler: Wikipedia users aren't smarter or more knowledgeable than Slashdot users).

    2. Re:Tyranny of the majority by Anonymous Coward · · Score: 0

      I'm more worried about pandering and "reputation whoring" turning wikipedia into just another web bbs.

    3. Re:Tyranny of the majority by Anonymous+Brave+Guy · · Score: 4, Insightful

      Yes, this system demonstrates the correlation between the content and the majority opinion, not between the content and the correct information (assuming such objectively exists).

      Of course, if you take as an axiom that the majority opinion will, in general, be more reliable than the latest random change by a serial mis-editor, then the correlation with majority opinion is a useful guideline.

      Something that might be rather more effective, though perhaps less practical, is for Wikipedia to bootstrap the process much as Slashdot once did: start with a small number of designated "experts", hand-picked, and give them disproportionate reputation. Then consider secondary effects when adjusting reputation: not just whether something was later edited, but the reputation of the editor, and the size of the edit.
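
      A minimal sketch of that weighting, assuming a made-up reputation scale from 0 to 10 on which seeded "experts" start at the top. The names Contributor, MAX_REP and apply_edit, and the linear penalty formula, are all hypothetical and not taken from the paper:

```python
from dataclasses import dataclass

MAX_REP = 10.0  # hypothetical ceiling; hand-picked experts would start here


@dataclass
class Contributor:
    name: str
    reputation: float  # assumed scale: 0.0 (unknown) .. MAX_REP (seeded expert)


def apply_edit(author: Contributor, editor: Contributor,
               words_changed: int, words_total: int, step: float = 1.0) -> float:
    """Lower `author`'s reputation after `editor` rewrites part of their text.

    The penalty grows with the fraction of the text that was changed and with
    the editor's own reputation, so a newcomer's tweak costs the author little.
    """
    if words_total <= 0:
        return author.reputation
    fraction = min(max(words_changed / words_total, 0.0), 1.0)
    weight = editor.reputation / MAX_REP
    author.reputation = max(0.0, author.reputation - step * fraction * weight)
    return author.reputation


# Hypothetical usage: the same 50-of-100-word rewrite, by two different editors.
expert = Contributor("seeded_expert", MAX_REP)
newcomer = Contributor("newcomer", 1.0)
print(apply_edit(Contributor("alice", 5.0), expert, 50, 100))
print(apply_edit(Contributor("bob", 5.0), newcomer, 50, 100))
```

      With these assumed numbers, a rewrite by the seeded expert costs the author ten times as much reputation as the same rewrite by the newcomer.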

      This doesn't avoid the underlying theoretical flaw of the whole idea, though, which is simply that in a community-written site like a wiki, edits are not necessarily bad things. Someone might simply be replacing the phrase "(an example would be useful here)" with a suitable example. This would be supporting content that was already worthwhile and correct, not indicating that the previous version was "untrustworthy".

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    4. Re:Tyranny of the majority by Ctrl-Z · · Score: 1

      Maybe the article heading should have said "Algorithm Rates Truthiness of Wikipedia Pages". Isn't that what's happening here?

      --
      www.timcoleman.com is a total waste of your time. Never go there.
    5. Re:Tyranny of the majority by Anonymous Coward · · Score: 0

      Although this method will certainly help filter pranks and cranks, it won't help if the "consensus" among wikipedia authors is wrong.

      Meh. If you don't care about that, then there's a much cheaper heuristic. If the article contains "test", "is gay" or "faggot", then it doesn't get enough eyeballs to be trustworthy. And before you point out false positives, I'm sure the method described in TFA has false positives too.

    6. Re:Tyranny of the majority by mdwh2 · · Score: 1

      I agree that I'm not sure this tool will be a good thing, but:

      If a true expert edits a page, but the masses don't agree with the edit, they will undo the expert's addition and give the expert a low reputation.

      If a true expert isn't providing sources for his edits, then tough luck - it's reasonable they get reverted. (There is a potential problem where edits-with-sources get reverted by some idiot editor, and you can't be bothered to seek help from other editors - though that applies to everyone, not just "true experts".)

    7. Re:Tyranny of the majority by costas · · Score: 1

      This is just a start; you can go a long ways trying to determine who's an expert by checking the extent and trustworthiness of their contributions within an article "cluster" (where clusters can be determined through link graphs or content correlation).

      Wikipedia can also try using implicit "voting" on articles by tracking how many of their users have read a page and "approved" it by not changing it. Here your vote can also be linked to your trustworthiness. And of course you can have explicit voting /.-style. The problem with these approaches is that it would be harder to share the voting data with sites using Wikipedia's content.
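
      The implicit-voting idea could look roughly like the sketch below. It assumes each page view is logged as a (reader_trust, edited) pair; that log format and the function name implicit_approval are invented here for illustration:

```python
from typing import Iterable, Tuple


def implicit_approval(views: Iterable[Tuple[float, bool]]) -> float:
    """Trust-weighted share of readers who viewed a page and left it unchanged.

    `views` holds (reader_trust, edited) pairs; a view with edited=False is
    counted as an implicit "approve" vote weighted by the reader's trust.
    """
    views = list(views)  # allow generators, since we iterate twice
    total = sum(trust for trust, _ in views)
    if total == 0:
        return 0.0
    approved = sum(trust for trust, edited in views if not edited)
    return approved / total


# Hypothetical log: two low-trust readers left the page alone, and one
# high-trust reader changed it.
print(implicit_approval([(1.0, False), (1.0, False), (2.0, True)]))  # 0.5
```

      Weighting by reader trust means a single change by a well-regarded reader can outweigh several silent views by unknown ones.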

    8. Re:Tyranny of the majority by Spy+Hunter · · Score: 1

      it won't help if the "consensus" among wikipedia authors is wrong.
      In many cases, knowing the popular consensus on an issue is just as important as knowing the truth. Wikipedia is tremendously useful as long as one understands the difference.

      I'll also point out that consensus is the best approximation to truth available to us as a society (flawed though it is); consensus is the foundation of both democracy and the scientific peer review process. If you found something better than consensus at determining truth, you'd have the basis for a government better than democracy and a scientific process better than peer review.
      --
      main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
    9. Re:Tyranny of the majority by rhizome · · Score: 1

      Yes, this system demonstrates the correlation between the content and the majority opinion, not between the content and the correct information (assuming such objectively exists).

      This objectivity does not and can not exist. Gödel proved this one in mathematics before Derrida popularized it in literary criticism.

      --
      When I was a kid, we only had one Darth.
    10. Re:Tyranny of the majority by Anonymous+Brave+Guy · · Score: 1

      I think you're being a little too clever. If you're talking about an axiomatic system, it's pretty objective to state the axioms, for example...

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  20. Algorithms are handy by rinkjustice · · Score: 1, Offtopic

    when things can be quantified and measured. I've always wondered about the algorithm of a brand's worth. What is the logo's value, in relation to the slogan, and the consumer experience?

    For instance, Google has a strong brand, despite their hideous logo and "Don't be evil" slogan, because the consumer experience is so good. Coca-Cola, on the other hand, scores big with their logo's distinctive cursive script, despite ongoing criticisms of its health effects and numerous allegations of wrongdoing by the company. And their product just isn't that good.

    Man, I would loves me an algorithm for that.

    1. Re:Algorithms are handy by nyctopterus · · Score: 1

      [Coca-Cola's] product just isn't that good.

      I don't think that true. There are a lot of drinks I prefer the taste of, but I can't drink a lot of them day in and day out like I can with coke. I think Coca-Cola's great strength as a product is that people don't get sick of it quickly.
    2. Re:Algorithms are handy by rinkjustice · · Score: 1

      A lot of products taste good, and yet don't dominate the market like Coke does. You have to admit there is more than simply taste involved.

    3. Re:Algorithms are handy by nyctopterus · · Score: 1

      Uh, did you read what I wrote? I said other products taste better. It's not about Coca-Cola tasting good per se, it's about being able to drink a lot of it regularly without it becoming too boring/sickening. I think it has to do with the cola flavour, hence Pepsi Cola being the other big player.

      Sure, a lot of Coke's dominance has to do with marketing, but the product (Coke's cola-flavoured beverage) is a good one.

    4. Re:Algorithms are handy by rinkjustice · · Score: 1

      I gotcha, and I think you've made a good point. But we come back to the marketing of it, and much of a company's marketing success relies on the image, the personality if you will, of the product. It's referred to as branding, and there are many aspects of branding that need to be considered (I'll point to an article I wrote on the very subject). What I'd like to know and haven't been able to get an answer to, is, is there a branding formula or algorithm out there?

  21. A reasonable first step... by dbolger · · Score: 2, Funny

    ...but call me when there's a tool to measure the truthiness of an article.

  22. Goddamn... by gowen · · Score: 5, Funny

    How did they pass up the chance to name this algorithm "Truthiness"?

    --
    Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
    1. Re:Goddamn... by Anonymous Coward · · Score: 0

      To be truthy I always thought "truthiness" sounded sort of "Willowy".

  23. Don't Care. by pdusen · · Score: 2, Insightful

    I might give a damn if Wikipedia editors had any actual interest in keeping articles truthful.

    1. Re:Don't Care. by shutdown+-p+now · · Score: 1

      I might give a damn if Wikipedia editors had any actual interest in keeping articles truthful.
      And you believe that none of us have such an interest? Not a single one of... I dunno... how many are there these days? Several thousand on en-wiki alone?
    2. Re:Don't Care. by pdusen · · Score: 1

      I'm sure they exist, but more often than not they seem to end up on the losing side of edit-wars. Wikipedia is a cesspool of biased elitist pseudojournalists with a twisted sense of what is or is not notable, factual, or reliable info. The few decent editors there don't seem to last very long.

  24. It doesn't have to be perfect by KingSkippus · · Score: 5, Insightful

    No algorithm, except maybe personally checking every single article yourself, will ever be perfect. I suspect that the stuff you talk about will be very rare exceptions, not the rule. In fact, one of the reasons that it is so rare is because people who know what the actual truth of a matter is can post it, cite it, and show it for all to see that some common misconception is, in fact, a misconception. This is much better than, say, a dead tree encyclopedia where, if something incorrect gets printed, it will likely stay that way forever in almost every copy that's out there. (And, incidentally, no such algorithm can exist, since dead tree encyclopedias generally don't include citations and/or articles' editing histories.)

    The goal wasn't to create a 100% perfect algorithm, it was to create an algorithm that provides a relatively accurate model and that works in the vast majority of cases. I don't see any reason this shouldn't fit the bill just fine.

    1. Re:It doesn't have to be perfect by Cryophallion · · Score: 0

      I never said it had to be perfect. However, the majority of systems can be improved upon. We should never settle for "good enough" when we have the ability to enhance something. I'm sure it would work extremely well on very popular articles. I was just bringing up one possible problem, that they can possibly resolve if they are made aware of it.

      I agree with you about wikipedia being different than dead tree encyclopedias. There have been a few people online who have pointed out where wikipedia fixes mistakes major encyclopedias have made.

      Looking at a lot of the comments, a number of people have the same concerns with the algorithm. Many very good things have been abused by people in the past. I like the idea, but I felt it was my responsibility to note a problem that could easily crop up.

    2. Re:It doesn't have to be perfect by duggi · · Score: 2, Insightful

      Why bother with an algorithm in the first place? Wikipedia is good for learning facts. If someone wants to know what Mary's room experiment was, they can find it. But if they want to know who did it and what kind of a person he is, should they not be referring to two or more sources? I guess the problem with credibility arises only when there is an opinion involved. It might work, sure, but when you come to know that the article is one big lie, would you not do some more research on finding out what is right? And if you see the page is clean, would you stop at that point?

      --
      http://monkeynesianeconomics.blogspot.com/
    3. Re:It doesn't have to be perfect by Colin+Smith · · Score: 1

      I suspect that the stuff you talk about will be very rare exceptions, not the rule.

      Not necessarily the case. All it takes is a good propagandist. Knowledge is power, if you can fool most people into believing a lie then they'll maintain the lie for you. It can take decades for the truth to come out, even if it's relatively obvious that the lie doesn't work.

      --
      Deleted
    4. Re:It doesn't have to be perfect by Anonymous Coward · · Score: 0

      In their model, an article's trustworthiness diminishes with each edit: the more people edit it, the shorter will any single edit last and the less trusted will the editors appear, even if the changes were only slight improvements to a generally accepted text. Top of the trust scale will be articles that only the submitter cares about or understands (and, ironically, edit-protected articles). This is diametrically opposed to Wikipedia's spirit of improvement through mass collaboration.

    5. Re:It doesn't have to be perfect by jackspenn · · Score: 1
      Point is if there is even a minor bias in wikipedia now (which there is) from long term posters, this method will result in that bias getting worse.

      • Plus, people who do not like the truth may want to edit it out repeatedly, if you are the truth poster your rating goes down as a result.
      • Plus, by knowing the algorithm you can plan and target an individual to effectively assassinate their credibility.
      • It devalues new posters, which is not good for a community-centered concept.

      In short it is a power grab that rewards long term posters in the name of truth.
      --
      Respect the Constitution
  25. FACT Rates Trustworthiness of SLASHDOT ARTICLES by Anonymous Coward · · Score: 0
  26. I do not trust wikipedia on any "divisive issue" by Shivetya · · Score: 1

    unless it is consistent with what I already know to be true or have had time to verify against other sources.

    too many zealots rule certain categories and unfortunately too many of the same are the very powers that be.

    --
    * Winners compare their achievements to their goals, losers compare theirs to that of others.
  27. This will promote one thing by Daimanta · · Score: 2, Insightful

    Groupthink.

    --
    Knowledge is power. Knowledge shared is power lost.
    1. Re:This will promote one thing by Anonymous Coward · · Score: 1, Funny

      I agree!

    2. Re:This will promote one thing by EveryNickIsTaken · · Score: 0, Redundant

      I believe the word you're looking for is Truthiness (copyright Stephen Colbert).

  28. moderators: mod parent up ! by Anonymous Coward · · Score: 0

    I have seen much of this in science; flawed conclusions get credibility by an accelerating number of citations. This kind of software will have exactly the same effect :-(

    Also, this software means that wikipedia is moving towards the kind of "elite control" over knowledge that wikipedia was supposed to oppose (or at least supposed to be an alternative to). This alternative now slowly merges with the very problem it was supposed to evade.

  29. AfD: nn by tepples · · Score: 2, Insightful

    If I edit a history page of a small rural village near where I live, I can guarantee that it will remain unaltered. None of the five people who have any knowledge or interest in this subject have a computer.

    If nobody else who has a computer cares, then it's less likely that your edits can be backed up with reliable sources. In fact, people might be justified in nominating the article for deletion on grounds of lack of potential sources.
    1. Re:AfD: nn by john_lewmanny · · Score: 1

      Reliable sources are not necessarily online. They can be verifiable via a trip to the library. Books and government documents are completely acceptable sources for Wikipedia.

    2. Re:AfD: nn by tepples · · Score: 1

      Reliable sources are not necessarily online. They can be verifiable via a trip to the library. Books and government documents are completely acceptable sources for Wikipedia.

      Sources to establish notability of a subject for Wikipedia have to be not only reliable but also independent. Government documents might not be independent enough, depending on what they cover.
  30. I see two possibilities for it: by BiteMeJimbo · · Score: 0

    First, it will be too unwieldy for wide-spread use, and will fade into history. Second, it will be useful and will be incorporated into typical Wikipedia reference activity, and then the bitches of the world will start locking down articles in order to game the program.

  31. Maybe in the future by Unique2 · · Score: 2, Funny

    What we really need is some sort of algorithm that compares new information to that which is already stored. It then could test hypotheses to gain further understanding. Unfortunately a machine with enough processing power to run this "critical thinking and understanding" algorithm would be impossible to build with today's technology. We would need a new type of processor that has maybe billions of "organic neurons", it would need to be equipped with highly sophisticated sensors, a method of self transportation, self-healing and even its own energy production system which could harvest energy indirectly from the Sun. We can only dream of such technology being available to everyone.

    --
    No trees were harmed in the posting of this message. However, a great number of electrons were terribly inconvenienced.
  32. Editors with Karma and the fourth estate culling. by sjwest · · Score: 1

    Wikipedia (Jim Wales) once employed an expert who claimed to have a string of degrees but did not - the editor would say I have a degree in this and the victim would agree that the editor knew better.

    But the expert got found out

    Wikipedia and the algorithm need to take account of persons (or organisations) who link to it to get better SEO - I cite 'the times of india' as being good at trashing content and putting in their own links to their site when in the 'news'.

    The wiki item in question is now useless and tells you less than before the criminal was famous. Logs, examples and other pertinent information was trashed by these media staff.

    Ideally one would re-edit the item, but why should I, and why should the SEO for 'the times of india' be deemed better than my edit, which was far more in-depth?

    Wiki is not perfect, but news organisations serve ads. When Jim Wales promotes the times of india I do hope he gets paid something by them.

    This is not sour grapes - but editors, idiots, and SEO ops might make their own link farm ghetto - 'the times of india', the bbc, cnn etc. Journalists can write (say, English); it does not mean they know much about the topic they discuss in English.

    While the tool has good intentions, my use of the wiki as a source has stopped, and no, I do not purchase the 'times of india' either.

    The scenario I discuss happens with the newly famous. That is when the professional SEO'ers at 'the times of india' rush in and mass delete.

    Not my problem, but I do hope Jim gets paid by the 'times of india'.

  33. Should be called "stability" by Random832 · · Score: 2, Insightful

    "trustworthiness" doesn't enter into whether something gets edited out, for precisely the same reason a need for this is perceived at all: it can be edited by anyone!

    --
    We've secretly replaced Slashdot with new Folgers Crystals - let's see if it notices.
  34. Changes to reputation calculation can be expected by InfiniteRandomChaos · · Score: 0

    This definitely is a step in the right direction. A few years later, reputation calculation could become more complex based on profiles of the individual on the entire net. Who knows!

  35. If a scoring system is created it will be gamed by Anonymous Coward · · Score: 0

    .. and it strikes me that every way in which a system like that could be gamed is quite horrible and damaging.

  36. Nah, Wikipedia is going the way of Slashdot by Anonymous Coward · · Score: 0

    This is nothing more than a karma system where loud masses rule viewpoints.

    The shark's in the water, and Wikipedia is up on their skis....

  37. Cut and paste edits by benhocking · · Score: 1

    I've seen some very large, profanity-laden edits. Many of these more than double the size of the article. These are not the majority of vandalisms, but they are a significant fraction.

    --
    Ben Hocking
    Need a professional organizer?
  38. Nice idea, but ... by kranberry · · Score: 1

    what if I am very trustworthy but can't spell or use proper grammar?

    1. Re:Nice idea, but ... by ch-chuck · · Score: 1

      You can always join the Boy Scouts. But then you'd have to be honest, loyal and some other stuff.

      --
      try { do() || do_not(); } catch (JediException err) { yoda(err); }
  39. all I need to know... by Anonymous Coward · · Score: 0

    ..about "trustworthy-ness" comes from articles and opinions like this.

    "Bully for him, and Wikipedia nay-sayers be danged. Some of that massively democratic participatory media can get pretty funny, and teach us more about human nature than dull, non-participatory text."

    Yes, those boring things like facts, who needs them, let alone an encyclopedia? What I really want is to be entertained. Let's invest in this huge experiment on "human nature" at the expense of truth and knowledge. Hell, it's not like anyone over 20 years old or who hasn't lived on a desert island doesn't already know about "human nature".

  40. Isn't this old news? by ta+bu+shi+da+yu · · Score: 1

    For a site that prides itself on being up there with announcing new things, this is really pretty much old news.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  41. Algorithm doesn't prove what it thinks it does by MSTCrow5429 · · Score: 2

    That algorithm is a model that does not match real world data. It might be useful to measure who has protection from the bureaucracy, but it won't and can't decipher how true something is simply by how many times and at what frequency people scribble over it. This algorithm is pseudo-scientific: it assumes a premise without investigating the veracity of said premise, and then runs away with it as if it were a proven one.

    --
    Slashdot: Playing Favorites Since 1997
  42. It's progress over edit counts by Animats · · Score: 2, Interesting

    One big problem with Wikipedia has been that editor status, and promotion to "adminship", is based on edit counts, the number of times someone has changed something. The editors with huge edit counts don't generally write much; it takes too long. Slashdot karma is a more useful metric than edit counts, but Wikipedia doesn't have anything like karma.

    I'd suggested on Wikipedia that we needed a metric for editors like "amount of new text that lasted at least 90 days without deletion". This UCSC thing is a similar metric.
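A rough sketch of that "text that lasted at least 90 days" metric, for anyone who wants to play with it. Everything here is an assumption for illustration: the `Contribution` record, the `surviving_text` helper, and the sample history are made up, and this is not the UCSC tool's actual computation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Contribution:
    """One block of new text added by an editor (hypothetical record)."""
    author: str
    chars_added: int
    added_at: datetime
    removed_at: Optional[datetime] = None  # None means still in the article


def surviving_text(contribs, now, min_age=timedelta(days=90)):
    """Sum, per author, the characters that lasted at least `min_age`."""
    totals = {}
    for c in contribs:
        # Text still present is measured up to `now`.
        end = c.removed_at if c.removed_at is not None else now
        if end - c.added_at >= min_age:
            totals[c.author] = totals.get(c.author, 0) + c.chars_added
    return totals


now = datetime(2007, 9, 1)
history = [
    Contribution("alice", 1200, datetime(2007, 1, 10)),  # still present
    Contribution("alice", 300, datetime(2007, 8, 1)),    # too recent to count
    Contribution("bob", 5000, datetime(2007, 3, 1), datetime(2007, 3, 2)),    # reverted next day
    Contribution("carol", 800, datetime(2007, 2, 1), datetime(2007, 6, 1)),   # lasted 120 days
]
scores = surviving_text(history, now)
print(scores)  # {'alice': 1200, 'carol': 800}
```

Note that a huge edit that gets reverted the next day scores nothing, while an edit count would still credit it.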

    1. Re:It's progress over edit counts by The+One+and+Only · · Score: 1

      I don't think that's been a pressing or urgent problem for a long time--it's more common for adminship candidates to be judged based upon the number of Featured Article development pushes they've been involved in than upon edit count, and it has been for months if not years.

      --
      In Repressive Burma, it's not just your connection that dies. slashdot.org/comments.pl?sid=314547&cid=20819199
  43. And now it'll no longer work by Racemaniac · · Score: 1

    now the algorithm is publicly known, anyone can make pages that abuse it somehow ^^ and then they can update it to counter those abuses etc... let the war begin!

  44. More important by Schnoogs · · Score: 0

    How trustworthy is this algorithm

  45. algorithm by Anonymous Coward · · Score: 0

    So if the Wikipedia editors, the US government, and Fox News spend all their time keeping pages the way they want them by editing user contributions, they automatically become the reliable source?

    1. Re:algorithm by Synic · · Score: 2

      No, actually you could use a dummy account to smear anyone's reputation by constantly re-editing their page back to whatever you want. By fighting over the content of a page, you effectively decrease the reputation of both parties. Since the dummy account isn't a real person, you are safe to throw it away after you are finished.
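A toy model of that edit war, just to show the shape of the argument. The update rule is an assumption (a flat penalty each time your text is reverted, no gain because nothing survives); the real algorithm's reputation formula is in the paper and is not reproduced here. `simulate_edit_war` and its parameters are hypothetical names.

```python
def simulate_edit_war(rounds, start=10.0, penalty=1.0):
    """Two accounts alternately revert each other (assumed update rule).

    Each revert removes the other account's latest text, so that account
    loses `penalty` reputation. Nothing survives long enough to earn any.
    """
    reputation = {"target": start, "dummy": start}
    for i in range(rounds):
        # Even rounds: the dummy reverts the target; odd rounds: the reverse.
        victim = ("target", "dummy")[i % 2]
        reputation[victim] = max(0.0, reputation[victim] - penalty)
    return reputation


print(simulate_edit_war(6))  # {'target': 7.0, 'dummy': 7.0}
```

Under this assumed rule both accounts sink together, but only one of them belongs to a person who cares.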

  46. Never confuse popularity with factual truth by presidenteloco · · Score: 2, Insightful

    They might be somewhat correlated, on a statistical basis, over
    many cases, but there are many individual cases and times
    when the currently popular view is wrong and the lone
    wolf opinions are later proven to have been correct.

    This algorithm would seem to be more of a popularity contest
    than a truth finder. I think we have to be very wary of
    the truth by mass agreement theory.

    Hint: Remember the "weapons of mass delusion" ?
    I bet someone commenting that the US government is lying
    through their teeth about it would have been re-edited
    pretty quick.

    --

    Where are we going and why are we in a handbasket?
  47. all articles contain accurate Anime information by Anonymous Coward · · Score: 0

    no matter how loosely related the subject is to Anime.

    many have good information about references in the "Animaniacs" cartoon too, which will come in handy when writing your term paper.

  48. Trustworthiness? NO. by Anonymous Coward · · Score: 0

    Trustworthiness? NO.

    Compliance with the dominant group view? YES.

    We're well on the road to a society full of truthiness when people refer to passing groupthink tests as a measure of "trustworthiness".

  49. Re:Tyranny of the minority by Anonymous Coward · · Score: 0

    ... is even worse ...

    For some reason, the phrase "tyranny of the majority" causes people to suspend disbelief. As is evidenced by your post being marked insightful...

  50. Massive truncation of edit histories by FeatureBug · · Score: 1
    I guess you meant "But other admins can still see it [any edit deleted by an administrator]"?

    Is ordinary admin (non-oversight) deletion used frequently compared to oversight deletion? I've seen articles where the entire edit history before a certain date containing several years' worth of edits was erased.

    What could be causing some edit histories to get out of chronological order, as mentioned in this post?

    1. Re:Massive truncation of edit histories by Veinor · · Score: 1

      Non-oversight deletion is used much more frequently than oversight deletion, yes. For example, if someone copy-pastes part of something from another website, that's usually just admin-deleted. Oversight happens when the legal counsel advises it. I have no idea what could cause the entire edit history to be deleted, though; do you happen to know any specific ones offhand? As for the out-of-order thing, I don't know. Doesn't seem to have been deleted (though maybe there was some oversight; I don't have that).

  51. PageRank by 12357bd · · Score: 1

    The algorithm looks very similar to Google's PageRank. Take edit time as the inverse of links to/from, and the whole concept looks very similar. The question is: PageRank became terribly biased once people started to automate cross-linking, so will this algorithm perform better against biased editors?

    --
    What's in a sig?
  52. Hmmmmmmm by mcmonkey · · Score: 1

    2) Wikipedia is not a source for academic research, and never will be.

    Your comment got me to thinkin'. (and on a Friday! Damn you!)

    The big thing in academic research is peer review, and what is Wikipedia but the extension of peer review to the larger community? I'm certainly not a fanboi and don't use Wikipedia as a source for anything work related, but I'm not too quick to add "never" to the end of that statement.

    When I go to a peer-review journal, either as a source for research or an outlet of publication, I am looking for two things. First I hope the community is right in trusting the journal. This is like the 'many eyes' program for weeding out bugs in open source software. And it's not just the lack of the community rising to refute the publication, but also the number of citations. Trust is demonstrated when members of the community continue to say, this is a source I trust and _use_.

    Second I hope that trust transfers to me (or my research). The foo community trusts the Journal of Foo Letters, so when I reference that source, I at least get the benefit of the doubt that I am starting from sound principles. Likewise, when my work is published in JFL, even those who disagree with my conclusions must either confer upon me a certain extent of respect or question the standing of JFL in the community.

    The difference is, JFL has gained the trust of the foo community by being run by members of that community. Its reviewers are foo experts in good standing who each have a history of original work. I go to Wikipedia for my foo research, and I don't get that level of trust.

    However we've already seen a movement towards openness in the peer-review journal community. What prevents Wikipedia setting up a foo area moderated by a panel of foo experts known to the foo community? If such a thing existed, and could still be called Wikipedia, I see no reason it could not be used as a source for academic research.

    1. Re:Hmmmmmmm by skoaldipper · · Score: 2, Insightful

      What prevents Wikipedia setting up a foo area moderated by a panel of foo experts known to the foo community?
      Define experts.

      Wikipedia does an extraordinary job drawing on a wide variety of peer resources, professional and layman alike. So-called "experts" like academia are just as political in their research and analysis as well - specifically, in the social sciences. Peer review never really amounts to much more than a consensus, but not necessarily an accurate one. Objectivity is the holy grail which I don't think will ever be achieved whether in Encyclopedia, Wikipedia, or newspaper for that matter. The objectivity is best left to the reader, as well as the research, imho.

      What you're asking for is really nothing more than some sort of certification, which most use as nothing more than back patting for their particular opinion. I say, take an Encyclopedia or Wikipedia for what it is, and just move on to the next.
      --
      I hope, when they die, cartoon characters have to answer for their sins.
    2. Re:Hmmmmmmm by Anonymous Coward · · Score: 0

      Actually, there's a kind of 'expert Wikipedia' in the works: Citizendium. I'm not saying it'll work, but it's worth keeping an eye on. :)

      http://en.wikipedia.org/wiki/Citizendium

    3. Re:Hmmmmmmm by xappax · · Score: 1

      The big thing in academic research is peer review, and what is Wikipedia but the extension of peer review to the larger community?

      Wikipedia does use peer review, but it's a different kind than what we see in the academic community. If something is peer reviewed in Wikipedia, it means that other people are able to confirm that all the listed information has been published in reliable sources. "Verifiability, not truth" as they say. If something is peer reviewed in the scientific community, it means that other people have made direct observations that support your hypothesis. Basically, it doesn't matter what's published or which "reliable sources" claim otherwise, the scientific method yields to the information that can be confirmed by observation.

      I'm certainly not a fanboi

      I am, actually - I think Wikipedia is one of the greatest things to emerge from the internet, but the reason I think that is because I realize what it is - and is not.

      What prevents Wikipedia setting up a foo area moderated by a panel of foo experts known to the foo community?

      I mean, you could set up a phpbb board with moderator access for the Physics department and call it "Wikipedia Physics", but it's just a name. The entire, central point of Wikipedia is that anyone can contribute and (at least theoretically) anyone can administrate. That central point is responsible for the phenomenal growth and participation in Wikipedia, whereas more restrictive media sit stagnant.

      I think the difficulty is that people conflate "Wikipedia" with "wikis". There could be (and actually are many) academic wikis, made available to scientists and researchers to review and share each other's ideas. These wikis are a great step away from the "journal-industrial complex", but they're not Wikipedia because not everyone can participate. And if they wish to gain the prestige and blind faith currently afforded scientific journals, they should probably stay that way.

      Wikipedia is able to invite everyone to participate because the bar for participation is so low. You don't have to discover anything, or even know anything. All you have to do is find information from other sources and aggregate it in a summarized article - sort of like Cliffs Notes for everything. Similarly, the litmus test for what's allowed to be on Wikipedia is much simpler than in academia. If someone writes something questionable you just ask "Source?". If there are some, OK, if not, deleted. If the expectations were more complex than that, it'd be impossible to manage while maintaining universal participation.

  53. WIKI's top 100 by malilo · · Score: 1

    Does anyone else find this list hilarious? http://hemlock.knams.wikimedia.org/~leon/stats/wikicharts/index.php?wiki=enwiki&ns=articles&limit=100&month=08%2F2007&mode=view

    WikiCharts -- Top 100 -- 08/2007

    Views per day    Percent    Title
    389 659 ± 1%     3.7114%    1. Main Page
     17 773 ± 3%     0.1693%    2. Harry Potter and the Deathly Hallows
     11 368 ± 4%     0.1083%    3. Wiki
     10 995 ± 4%     0.1047%    4. Harry Potter
      9 649 ± 4%     0.0919%    5. Transformers (film)
      5 286 ± 6%     0.0504%    6. Naruto
      5 173 ± 6%     0.0493%    7. Wikipedia
      4 427 ± 6%     0.0422%    8. Deaths in 2007
      4 119 ± 6%     0.0392%    9. United States
      3 827 ± 7%     0.0365%    10. Harry Potter and the Order of the Phoenix (film)
      3 714 ± 7%     0.0354%    11. Sex
      3 616 ± 7%     0.0344%    12. List of sex positions
      3 584 ± 7%     0.0341%    13. Hypertext Transfer Protocol
      3 535 ± 7%     0.0337%    14. The Simpsons
      3 519 ± 7%     0.0335%    15. YouTube
      3 486 ± 7%     0.0332%    16. Bleach (manga)
      3 422 ± 7%     0.0326%    17. Guitar Hero III: Legends of Rock
      3 227 ± 7%     0.0307%    18. List of characters in the Harry Potter books
      2 838 ± 8%     0.0270%    19. The Simpsons Movie
      2 789 ± 8%     0.0266%    20. List of Konoha ninja
      2 789 ± 8%     0.0266%    21. List of Akatsuki members
      2 773 ± 8%     0.0264%    22. Optimus Prime
      2 692 ± 8%     0.0256%    23. Harry Potter and the Half-Blood Prince
      2 676 ± 8%     0.0255%    24. Seven Wonders of the World
      2 514 ± 8%     0.0239%    25. Chris Benoit
      2 449 ± 8%     0.0233%    26. Harry Potter (character)
      2 416 ± 8%     0.0230%    27. 50 Cent
      2 368 ± 8%     0.0226%    28. Megatron

    ... so all wiki users are nerdy harry potter fans interested in sex? ha!

    --
    "sometimes he felt that his whole life was a dream, and he wondered whose it was and whether they were enjoying it."
  54. Do they track popularity of topics? by Bluesman · · Score: 1

    I could submit nonsense on a variety of obscure topics, with low odds that anybody will find and correct them, thereby building up a great reputation. I wonder if their system accounts for that.

    This is starting to sound like Karma for wikipedia.

    --
    If moderation could change anything, it would be illegal.
  55. I found another flaw in the algorithm by presidenteloco · · Score: 1

    On reviewing the demo, it would seem that the untrusted i.e. frequently
    changed sections are essentially the sections that people care about
    or (in our present society) have more knowledge about or insight into,
    so people want to tweak those sections.

    So ironically, the algorithm will flag as untrustworthy the most relevant
    sections of articles. The "don't know - don't care" parts will be virgin
    white and pure.

    --

    Where are we going and why are we in a handbasket?
  56. Sounds familiar by COMON$ · · Score: 1
    obg quote references: (Thanks for most of them goes to www.av8n.com)

    We will never make a 32-bit operating system, but I'll always love IBM. -Gates

    What, sir, would you make a ship sail against the wind and currents by lighting a bonfire under her deck? I pray you, excuse me, I have not the time to listen to such nonsense. - Napoleon

    I watched his countenance closely, to see if he was not deranged ... and I was assured by other senators after he left the room that they had no confidence in it. - U.S. Senator Smith of Indiana, after witnessing a demonstration of Samuel Morse's telegraph

    Well-informed people know it is impossible to transmit the voice over wires and that were it possible to do so, the thing would be of no practical value. - Boston Post

    Louis Pasteur's theory of germs is ridiculous fiction. - Pierre Pachet, professor of physiology at Toulouse

    Radio has no future. - Lord Kelvin

    The ordinary "horseless carriage" is at present a luxury for the wealthy; and although its price will probably fall in the future, it will never, of course, come into as common use as the bicycle. - Literary Digest

    I have traveled the length and breadth of this country and talked with the best people, and I can assure you that data processing is a fad that won't last out the year. - The editor in charge of business books for Prentice Hall

    We don't like their sound. We don't think they will do anything in their market. Guitar groups are on their way out. - Decca Recording Co., declining to sign the Beatles

    There is no reason anyone would want a computer in their home. - Ken Olson

    There are several other fun ones that I am sure people will post, but the point is that saying "never" is a hard prophecy to live up to. It is also very hard to verify quotes without the use of a wiki or a network. But I will include this excerpt to refute your "claims to be" remark: "Wikipedia's founder, Jimmy Wales, says he wants to get the message out to college students that they shouldn't use it for class projects or serious research." - http://chronicle.com/wiredcampus/article/1328/wikipedia-founder-discourages-academic-use-of-his-creation

    --
    CS: It is all sink or swim...oh and did I mention there are sharks in that water?
  57. MOD PARENT UP by Anonymous Coward · · Score: 0

    Nothing but personal direct observation can be an authoritative source of anything, and even that can fail.

  58. Compliance vs Compression by Baldrson · · Score: 1

    This algorithm is measuring compliance with the Wikipedia dispute processing norms -- not "trustworthiness". A better measure of "trustworthiness" of a passage is its consistency with the rest of the body of human knowledge -- which is most strictly measured by the degree to which it is not a special case within a compressed representation of that knowledge. This is the basis of the Hutter Prize for Lossless Compression of Human Knowledge. The Hutter Prize is currently using a 100M sample from Wikipedia as its corpus.

  59. Re:I do not trust wikipedia on any "divisive issue by Xtifr · · Score: 1

    > unless it is consistent with what I already know to be true

    Absolutely. I keep trying to replace all their lies about quantum mechanics with my truth about the Electro-Flux Aether and Spiritual Gravitation, and I keep getting reverted.

    > or have had time to verify against other sources

    Ah, so you do understand how Wikipedia should be used. Good on yer, mate. :)

    > too many zealots rule certain categories

    Yeah, like those bastards who keep trying to insist that the Holocaust actually happened, that evolution is a scientific fact, and that the Earth goes around the Sun. I mean, have you ever heard anything so preposterous? I'd rather get my information from more reliable sources. Like Fox News and the New York Times. And Slashdot. :)

  60. Spelling Mistakes? by logicnazi · · Score: 2, Insightful

    What I want to know is if it is smart enough to distinguish edits that correct spelling and grammar mistakes from those that change content.

    In particular I'm worried that the system will undervalue the information from people whose edits are frequently cleaned up by others even if that content is left unchanged.
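One way such a check might look, as a minimal sketch only: treat an edit as cosmetic when every changed word is a near-match of the word it replaced. The `is_cosmetic_edit` helper and the 0.8 threshold are assumptions for illustration, and nothing here is claimed to be what the UCSC tool does.

```python
import difflib
import re


def words(text):
    """Lowercase word tokens, ignoring punctuation and spacing."""
    return re.findall(r"[a-z0-9']+", text.lower())


def is_cosmetic_edit(old, new, similarity=0.8):
    """True if every changed word is a near-match of the word it replaced."""
    a, b = words(old), words(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Inserted or deleted words count as a content change.
        if tag != "replace" or (i2 - i1) != (j2 - j1):
            return False
        for w_old, w_new in zip(a[i1:i2], b[j1:j2]):
            if difflib.SequenceMatcher(a=w_old, b=w_new).ratio() < similarity:
                return False
    return True


print(is_cosmetic_edit("The algoritm recieves the edit history.",
                       "The algorithm receives the edit history."))  # True
print(is_cosmetic_edit("The tool was built at UCSC.",
                       "The tool was built at MIT."))                # False
```

A character-similarity threshold is crude: it misjudges fixes to very short words (a three-letter transposition falls below 0.8), and similar-looking words with different meanings could slip through, so a real system would need something smarter.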

    --

    If you liked this thought maybe you would find my blog nice too:

  61. Trustworthiness v1.0b by Anonymous Coward · · Score: 0

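    #!/usr/bin/perl
    use strict;
    use warnings;

    # One counter per verdict; each line of the dump bumps all three.
    my ($crap, $nonsense, $spam) = (0, 0, 0);

    open(my $wikipedia, "<", "wikipedia.dump") or die "Unable to find garbage file - $!\n";

    while (<$wikipedia>) {
      ++$crap;
      ++$nonsense;
      ++$spam;
    }
    close($wikipedia);

    # The verdict never looks at the counters.
    print "Not very trustworthy.\n";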

  62. Pseudonyms? by chris_sawtell · · Score: 1

    Wouldn't one find an entry, for example, about the early history of the WWW by the genuine Sir Tim Berners-Lee to be considerably more trustworthy than one signed by some anonymous WikiWonderBoy?

    I don't think the algorithm takes that into account.

    1. Re:Pseudonyms? by Jotii · · Score: 1

      This reminds me of an essay I was writing for school. We had been told that Wikipedia is an acceptable source as long as the contributors seem trustworthy, e.g. with a full name instead of a pseudonym. At the relevant page, the majority of the contributions were by a reputable Wikipedia administrator: Can't sleep, clown will eat me

      --
      [sig]
  63. Other uses in software development by philcolbourn · · Score: 1

    It could also be useful for the trustworthiness of code from various authors! - Poorly written/designed code will probably have more faults and require more edits.