Algorithm Rates Trustworthiness of Wikipedia Pages
paleshadows writes "Researchers at UCSC developed a tool that measures the trustworthiness of each Wikipedia page. Roughly speaking, the algorithm analyzes the entire 7-year user-editing-history and utilizes the longevity of the content to learn which contributors are the most reliable: If your contribution lasts, you gain 'reputation,' whereas if it's edited out, your reputation falls. The trustworthiness of a newly inserted text is a function of the reputation of all its authors, a heuristic that turned out to be successful in identifying poor content. The interested reader can take a look at this demonstration (random page with white/orange background marking trusted/untrusted text, respectively; note "random page" link at the left for more demo pages), this presentation (pdf), and this paper (pdf)."
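For readers who want to see the idea in code, here is a minimal sketch of the scheme as the summary describes it, not the UCSC implementation. The function names, the unit step size, the floor at zero, and the use of a plain average for text trust are all assumptions made for illustration; the summary only says that surviving contributions raise reputation, removed ones lower it, and that the trust of a text is some function of its authors' reputations.

```python
from collections import defaultdict


def update_reputation(reputation, author, survived, step=1.0):
    """Adjust an author's reputation after a later revision is observed.

    survived=True means the author's text was kept; False means it was
    edited out. The step size and the floor at 0.0 are assumptions.
    """
    delta = step if survived else -step
    reputation[author] = max(0.0, reputation[author] + delta)
    return reputation[author]


def text_trust(reputation, authors):
    """Trust of a text span as the mean reputation of its authors.

    The averaging rule is a stand-in for the unspecified "function of
    the reputation of all its authors".
    """
    if not authors:
        return 0.0
    return sum(reputation[a] for a in authors) / len(authors)


# Hypothetical editors: alice's text survives three revisions,
# bob's single contribution is edited out.
reputation = defaultdict(float)
for _ in range(3):
    update_reputation(reputation, "alice", survived=True)
update_reputation(reputation, "bob", survived=False)

trust = text_trust(reputation, ["alice", "bob"])
```

In this toy run a span touched by both editors ends up with a trust value halfway between their reputations, which is the kind of signal the demo presumably maps to the white/orange background.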
Someone should make a wikipedia entry for this algorithm to see how trustworthy it is.
>If your contribution lasts, you gain 'reputation,' whereas if it's edited out, your reputation falls
...
And the editor wars start
Washington bullets will simply be known as the "Bulle
It'd be nice if it could be generalised to other sites...
Deleted
Every paper touting automatic adjustments for gaming the system becomes obsolete the moment it is published.
(Godwin didn't publish this, but I might get around to editing his Wikipedia entry to say that he did).
I've been noticing some of the edit histories for articles that are 5 years old on Wikipedia stop well before 5 years ago. Have some of the edit histories been lost or deliberately truncated?
So, if there is a myth that a lot of people believe is true, then it will stay up there as it is not challenged. So, it still gets reputation, and therefore more credibility, making it more likely that the myth will be perpetuated.
Also, if someone hasn't noticed something that is wrong on an esoteric entry, it will also be given credibility, and once again be more likely to be considered to be fact.
While you could add voting to the algorithm to have people vote on whether it is true, that still gets destroyed by someone who just votes because they think it's true, not because they have verified it.
Either way, it potentially gives additional credibility to something that may be very wrong.
Seems to work, the entire page turned orange.
+0 Meh
They should just call it wiki-karma.
It appears they include #REDIRECT pages; the very first page the random link took me to was Cheliceriformes, with the #REDIRECT line in orange. Seems an easy way to gain trust, once a redirect is created it is hardly ever changed.
Does it take into account magnitude of error corrections? If major portions of someone's articles are being rewritten, that's a good reason to de-rep them. If someone makes a bunch of minor spelling or trivial errors, then that's not necessarily a reason to do so.
And, of course, there is the potential for abuse. If the software could intelligently track reversions and somehow ascribe to those events a neutral sort of rep, that would probably help the system out.
As it stands, they're essentially trying to objectively judge "correctness" of facts without knowing the actual facts to check. That's somewhat like polling a college class for answers and assigning grades based on how many other people DON'T say that they disagree with a certain person in any way.
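The two suggestions in this comment (scale the penalty by how much of an author's text was rewritten, and treat reversions as neutral) can be sketched as below. This is only an illustration of the commenter's proposal; the thread does not say whether the UCSC tool works this way, and the function name, the character-count measure, and the `scale` parameter are assumptions.

```python
def reputation_delta(chars_changed, chars_total, is_revert=False, scale=1.0):
    """Reputation change for an author whose text was edited by someone else.

    chars_changed: how many of the author's characters were rewritten.
    chars_total:   how many characters the author contributed.
    is_revert:     True if the edit was a plain reversion, which this
                   sketch treats as neutral (an assumption).
    """
    if is_revert or chars_total <= 0:
        return 0.0
    # Clamp the rewritten fraction to [0, 1] so bad inputs cannot
    # produce a penalty larger than the scale.
    fraction = min(1.0, max(0.0, chars_changed / chars_total))
    return -scale * fraction


# A spelling fix touching 5 of 1000 characters barely matters,
# while rewriting 800 of 1000 costs most of the full penalty.
minor = reputation_delta(5, 1000)
major = reputation_delta(800, 1000)
revert = reputation_delta(1000, 1000, is_revert=True)
```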
the relative controversy of the item being edited.
If I edit a history page of a small rural village near where I live, I can guarantee that it will remain unaltered. None of the five people who have any knowledge or interest in this subject have a computer.
If I edit an item on Microsoft's attitude to standards, or the US occupation of Iraq, I'm going to be flamed the minute the page is saved, unless I say something so banal that no one can find anything interesting in it.
But my Microsoft page might be accurate, and my village history a tissue of lies....
Sounds like a worthy start to the process of introducing more trustworthiness into Wikipedia entries, but this maybe needs tuning for content type too.
After all, just because someone is a reliable expert at editing the Wikipedia entries on Professional Wrestling or Superheroes doesn't necessarily mean we should trust their edits on, for instance, the sensitive issues of Tibetan sovereignty.
erroneous: look me up in a dictionary
I realize that an encyclopedia by definition will always emphasize the established majority opinion about any given subject. But it seems that this tool might strengthen majority opinions beyond what is reasonable. If you happen to edit an article by adding valid but unpopular dissenting points of view, and the other contributors are sufficiently boneheaded, you lose karma (or whatever the tool calls it) for no good reason. This might then easily develop a life of its own, and you are screwed.
"When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
Although this method will certainly help filter pranks and cranks, it won't help if the "consensus" among wikipedia authors is wrong. If a true expert edits a page, but the masses don't agree with the edit, they will undo the expert's addition and give the expert a low reputation. Thus, the trust rating becomes a tool for maintaining erroneous, but popular ideas.
That said, I can't help but believe that this tool is a net positive because it makes points of debate more visible. One could even argue that it literally highlights the frontiers of human knowledge. That is, high-trust (white) text is well known material and highlighted (orange) text represents contentious or uncertain conclusions.
Two wrongs don't make a right, but three lefts do.
...but call me when there's a tool to measure the truthiness of an article.
How did they pass up the chance to name this algorithm "Truthiness"?
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
I might give a damn if Wikipedia editors had any actual interest in keeping articles truthful.
No algorithm, except maybe personally checking every single article yourself, will ever be perfect. I suspect that the stuff you talk about will be very rare exceptions, not the rule. In fact, one of the reasons that it is so rare is because people who know what the actual truth of a matter is can post it, cite it, and show it for all to see that some common misconception is, in fact, a misconception. This is much better than, say, a dead tree encyclopedia where, if something incorrect gets printed, it will likely stay that way forever in almost every copy that's out there. (And, incidentally, no such algorithm can exist, since dead tree encyclopedias generally don't include citations and/or articles' editing histories.)
The goal wasn't to create a 100% perfect algorithm, it was to create an algorithm that provides a relatively accurate model and that works in the vast majority of cases. I don't see any reason this shouldn't fit the bill just fine.
Groupthink.
Knowledge is power. Knowledge shared is power lost.
What we really need is some sort of algorithm that compares new information to that which is already stored. It then could test hypotheses to gain further understanding. Unfortunately a machine with enough processing power to run this "critical thinking and understanding" algorithm would be impossible to build with today's technology. We would need a new type of processor that has maybe billions of "organic neurons", it would need to be equipped with highly sophisticated sensors, a method of self transportation, self-healing and even its own energy production system which could harvest energy indirectly from the Sun. We can only dream of such technology being available to everyone.
No trees were harmed in the posting of this message. However, a great number of electrons were terribly inconvenienced.
"trustworthiness" doesn't enter into whether something gets edited out, for precisely the same reason a need for this is perceived at all: it can be edited by anyone!
We've secretly replaced Slashdot with new Folgers Crystals - let's see if it notices.
Whereas the implementation of "+1 funny" will be the end of the information age.
If you mod this up, your slashdot background will turn into a beautiful sunset!
That algorithm is a model that does not match real world data. It might be useful to measure who has protection from the bureaucracy, but it won't and can't decipher how true something is simply by how many times and at what frequency people scribble over it. This algorithm is pseudo-scientific, by assuming a premise without investigating the veracity of said premise, and then running away with it as if it were a proven one.
Slashdot: Playing Favorites Since 1997
One big problem with Wikipedia has been that editor status, and promotion to "adminship", is based on edit counts, the number of times someone has changed something. The editors with huge edit counts don't generally write much; it takes too long. Slashdot karma is a more useful metric than edit counts, but Wikipedia doesn't have anything like karma.
I'd suggested on Wikipedia that we needed a metric for editors like "amount of new text that lasted at least 90 days without deletion". This UCSC thing is a similar metric.
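A rough sketch of the "new text that lasted at least 90 days" metric suggested here, assuming each contribution is recorded as a hypothetical tuple of (characters added, time added, time deleted or None). The record layout and function name are invented for illustration; real Wikipedia history data would need to be processed into this shape first.

```python
from datetime import datetime, timedelta


def surviving_text(contributions, now, min_age=timedelta(days=90)):
    """Total characters of new text that lasted at least min_age.

    contributions: iterable of (chars_added, added_at, deleted_at),
    where deleted_at is None if the text is still present at `now`.
    """
    total = 0
    for chars_added, added_at, deleted_at in contributions:
        end = deleted_at if deleted_at is not None else now
        if end - added_at >= min_age:
            total += chars_added
    return total


# Made-up history for one editor.
now = datetime(2007, 9, 1)
history = [
    (500, datetime(2007, 1, 1), None),                  # still there, old enough
    (200, datetime(2007, 8, 1), None),                  # still there, too recent
    (300, datetime(2007, 1, 1), datetime(2007, 2, 1)),  # deleted after a month
    (100, datetime(2007, 1, 1), datetime(2007, 6, 1)),  # lasted about 5 months
]
score = surviving_text(history, now)
```

Unlike a raw edit count, this number only grows when an editor adds text that other people leave alone for a while.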
They might be somewhat correlated, on a statistical basis, over many cases, but there are many individual cases and times when the currently popular view is wrong and the lone wolf opinions are later proven to have been correct. This algorithm would seem to be more of a popularity contest than a truth finder. I think we have to be very wary of the truth by mass agreement theory.
Hint: Remember the "weapons of mass delusion"? I bet someone commenting that the US government is lying through their teeth about it would have been re-edited pretty quick.
Where are we going and why are we in a handbasket?
Wikipedia does an extraordinary job from a wide variety of peer resources, both professional and layman alike. So-called "experts" like academia are just as political in their research and analysis, specifically in the social sciences. Peer review never really amounts to much more than a consensus, but not necessarily an accurate one. Objectivity is the holy grail which I don't think will ever be achieved whether in Encyclopedia, Wikipedia, or newspaper for that matter. The objectivity is best left to the reader, as well as the research, imho.
What you're asking for is really nothing more than some sort of certification, which most use as nothing more than back patting for their particular opinion. I say, take an Encyclopedia or Wikipedia for what it is, and just move on to the next.
I hope, when they die, cartoon characters have to answer for their sins.
What I want to know is if it is smart enough to distinguish edits that correct spelling and grammar mistakes from those that change content.
In particular I'm worried that the system will undervalue the information from people whose edits are frequently cleaned up by others even if that content is left unchanged.
If you liked this thought maybe you would find my blog nice too:
No, actually you could use a dummy account to smear anyone's reputation by constantly re-editing their page back to whatever you want. By fighting over the content of a page, you effectively decrease the reputation of both parties. Since the dummy account isn't a real person, you are safe to throw it away after you are finished.