Wikipedia 2.0, Now With Trust?
USB EVDO writes "The online encyclopedia is set to trial two systems aimed at boosting readers' confidence in its accuracy. Over the past few years, a series of measures aimed at reducing the threat of vandalism and boosting public confidence in Wikipedia have been developed. Last month a project designed independently of Wikipedia, called WikiScanner, allowed people to work out what the motivations behind certain entries might be by revealing which people or organizations the contributions were made by. Meanwhile the Wikimedia Foundation, the non-profit that oversees the online encyclopedia, now says it is poised to trial a host of new trust-based capabilities."
The problem is that many other reference sites on various topics, developed privately by informed and qualified individuals, have now folded since the maintainers thought Wikipedia superseded hosting such information on one's own website. And now, such information on Wikipedia can be vandalized at any moment right before someone would go look at the page, and kooks can twist the page to their own ends.
Troll-proofing a reference site (as opposed to a casual forum like /.) without a paid staff is laughable; it's just a good-sounding measure to pacify a particular market (Germany in this case). It will be easy enough for either pranksters or marketers/scammers to figure out and work around whatever provisions they set up... also there will be a black market for people who have established the creds to get it done.
1. Pay contributors, i.e., give them revenue. Even micro-payments will do, pennies. (The added side-benefit of this is that it means contributors will most likely need PayPal accounts, which most likely means they will be "of age": no more changing entries as a result of bets made in the back of the school bus.)
2. Fire contributors who screw up, depriving them of that revenue.
3. Problem solved.
Anything else is a hippy-dippy feel-good buzz-word Web-X-point-something-or-other that begins with the letter "cluster."
Come on, even /. is more trustworthy. Trust is not something you can buy with another set of features.
root of all...
But this is exactly what an encyclopedia is for, to get a basic overview and pointers to the real sources. Have people forgotten that, or were there a lot of people out there using Britannica as a primary source?
And yet the simplest and most effective quality control, requiring registration, is still considered sacrilege to the Wikipedia overlords...
I tried adding something once to an article but they kept bludgeoning me and removing it because it wasn't referenced. I did reference it to a reliable source, but I put it in an "External Links" section as I couldn't add it to the citations/sources without being a registered user for some reason. If I have to become a registered user to add a citation, and if I have to add citations to add things without them being automatically deleted (regardless of their merit), that destroys a lot of anonymity. Which may be good or bad depending on your POV.
Wikipedia is pretty good as a resource in my experience, but lately they have been obsessed with being SEEN as accurate and are implementing rules that get them SEEN as accurate, and I don't know if the actual result is that they become more accurate or just more orthodox and accepted by the establishment. They have already been shown in a study to be as accurate as or more accurate than Encyclopaedia Britannica. I think the direction they are heading does not actually lead them toward their ideal (accuracy) but more toward mob rule (orthodox accepted truths).
Well, anyone who reads self-help books has a problem with understanding reality, let alone truth. Let's examine this wishy-washy new age idea that truth is a consensus consisting of a lot of compromises. I think that this idea is completely flawed on every level. You obviously do not. What consensus do we reach; that it's only partly a bunch of shit?
To back your point up you mention that things like "history" work less well than things like "thermodynamics". Do you really believe this is because people understand each other's views on science subjects more than arts subjects? That a consensus position can more easily be reached?
The basic problem with this theory of truth by consensus is that it assumes that truth is not discrete, and that it can be reached by majority voting. In many subjects truth is discrete, and the voting model is closer to winner-takes-all. The reason that the truth crystallizes in this manner is that it is objectively testable. This is why we call the set of things that behave in this manner science: that which can be studied by the scientific method.
Furthermore, I think that you have a fundamental misunderstanding of what Wikipedia's purpose is. It has very explicit design goals: using your terms, it attempts to construct articles that have all of the known facts. That is, it ignores "understanding" as you put it, or POV as wiki puts it. If a fact can be attributed to a respectable source then it goes in. Understanding is left as an exercise for the reader.
You miss the point that wiki is better for science because, in terms of establishing what the facts are, science subjects are the low-hanging fruit. History (for example) is harder because the facts are not always in an objectively testable form, and usually have to be pried from subjective observation. An ideal Wikipedia article is not a "compromise" between all of the opinions that went into it - it is a collection of all of the facts that could be verified, regardless of whether or not the contributors agreed upon them.
Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
I like De Alfaro's statistical approach of ranking both blocks of text and editors.
:-)
I also like the approach of checking IP addresses, although I was caught in that: earlier this year I added an article on machine learning, but someone from my ISP had done vandalism; I was blocked for a few days until I went through their system; no problem, just a delay.
The whole topic of trust is a very interesting problem, one that also occurs on web sites, the semantic web, etc. (Imagine trying to perform reasoning with RDF on the web when some of it contains fake information.)
I (slightly) embarrassed myself last night by sending a link to a parody article to a few friends and family, not realizing that it was a parody - I had to send out a "never mind" email this morning.
I have mixed feelings about private anonymous use of the web vs. the benefits to knowing who people are. I very recently turned off anonymous posting on my web blog - too many anonymous posts offered opinion that I doubt the posters would express if they represented themselves.
As an open platform (hopefully forever), the Internet will evolve in interesting ways
I would be very surprised (and a little annoyed) if they don't use this as the basic mechanism behind their validation scheme. This preserves the freedom of editing, and greatly decreases the probability that somebody reading Wikipedia will see a vandalized/substandard version of an article. Rather than merging changes from one branch to the other, like in software development, however, I think WP would be better off tagging a version of an article as stable, and keeping the latest version as unstable.
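To make the stable/unstable tagging idea concrete, here is a minimal sketch in Python. It is not how MediaWiki does or will do it; the Article class and its method names are made up for illustration. The only point is that tagging a revision as stable is just moving a pointer, with no merging between branches, and editing stays open to everyone.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Article:
    """Hypothetical article: keeps every revision plus a 'stable' pointer."""
    revisions: List[str] = field(default_factory=list)
    stable_index: Optional[int] = None  # index of the revision tagged stable

    def edit(self, text: str) -> int:
        """Anyone can still edit; the new text becomes the latest (unstable) revision."""
        self.revisions.append(text)
        return len(self.revisions) - 1

    def tag_stable(self, index: int) -> None:
        """Mark an existing revision as stable; nothing is merged or copied."""
        if not 0 <= index < len(self.revisions):
            raise IndexError("no such revision")
        self.stable_index = index

    def view(self, prefer_stable: bool = True) -> str:
        """Readers get the stable revision by default, else the latest one."""
        if not self.revisions:
            raise ValueError("article has no revisions")
        if prefer_stable and self.stable_index is not None:
            return self.revisions[self.stable_index]
        return self.revisions[-1]


article = Article()
good = article.edit("Carefully sourced text.")
article.tag_stable(good)
article.edit("VANDALISM")
print(article.view())                     # the stable revision
print(article.view(prefer_stable=False))  # the latest, unstable revision
```

A casual reader sees the tagged revision, while editors keep working on the latest one, so a vandalized edit never reaches the default view unless someone tags it.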
The main problem is who decides when an article or section should go stable. This is where the complicated algorithms come in. One of the most important principles of Wikipedia is that authority counts for absolutely nothing. People complain that Wikipedia makes no use of experts, but that's not true. It simply will not privilege additions by experts just because they are experts. Everybody is equal. This should be reflected in the validation scheme. So many proposals have teams of fact checkers and domain experts, which is very much unlike Wikipedia. An automated trust network (like the one described in the article) should be used to assign contributors a trust rating, and then let people vote on the validity of an article or section.
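A rough sketch of what "trust rating plus voting" could look like, again in Python. Everything here is an assumption for illustration: the function names, the ratings, and the 0.75 threshold are invented, and how the trust network actually computes a rating is left out entirely. The rating is assumed to come from a contributor's editing record, not from credentials.

```python
from typing import Dict


def weighted_approval(votes: Dict[str, bool], trust: Dict[str, float]) -> float:
    """Return the trust-weighted share of 'valid' votes, between 0.0 and 1.0.

    votes maps a contributor name to True (valid) or False (not valid).
    trust maps a contributor name to a non-negative rating; unknown voters count as 0.
    """
    total = sum(trust.get(name, 0.0) for name in votes)
    if total == 0:
        return 0.0  # nobody with any trust has voted yet
    in_favor = sum(trust.get(name, 0.0) for name, ok in votes.items() if ok)
    return in_favor / total


def is_stable(votes: Dict[str, bool], trust: Dict[str, float],
              threshold: float = 0.75) -> bool:
    """Hypothetical rule: go stable once weighted approval reaches the threshold."""
    return weighted_approval(votes, trust) >= threshold


# Made-up ratings: a brand-new account carries no weight yet.
trust = {"alice": 3.0, "bob": 1.0, "new_account": 0.0}
votes = {"alice": True, "bob": False, "new_account": False}
print(weighted_approval(votes, trust))
print(is_stable(votes, trust))
```

Note that a pile of throwaway accounts with zero rating cannot swing the vote, which is the main reason to weight by trust instead of counting heads.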
I should also point out that none of this is new. Most of these ideas have been in the pipeline for years. Check out http://meta.wikimedia.org/wiki/Article_validation_proposals#Automated_Trust_Networks for a list of proposed validation schemes.
If you had to pay to edit Wikipedia, only the serious editors would do it.
[sig]
Except for the notability crackdown. Unless the 5th season of Buffy is notable in some way, articles about it will probably be deleted with prejudice. I used to go to Wikipedia to read trivia about every single episode of Futurama, but they've started cracking down on that; if a TV episode hasn't been nominated for an award, you might not be able to find it on WP in the future. (There are other possible reasons for something to be considered notable besides nominations, of course.) "Trivia" sections are being removed from articles; long articles about "uncommon" subjects are being replaced with short summaries; articles that don't affirm their own notability will get speedily deleted; and articles without adequate citations or good references will be tagged for future removal.
Some editors say, "No, that's exaggerated---we rarely delete things!" That may have been true a few years ago, but that is not the current policy. I've studied the recent editing guidelines and asked numerous questions in <irc://irc.freenode.net/#wikipedia>. Search Wikipedia for something that doesn't exist; now, the 'not found' page has a new line, something like: the article may have been deleted for not meeting quality standards. Open an edit window for an article that doesn't currently exist; there are now multiple boldface warnings about certain things being candidates for speedy deletion. I'm afraid to contribute anything anymore. If I really feel like there's an important fact missing from an article, I'll try to visit a local college library and come up with some good sources, but I wouldn't dare create a new article, because I know it would have little hope of surviving unless an editor happened to feel like looking for references instead of hitting delete.
"Imagine everyone having access to all of human knowledge^W^W^W^Wonly the stuff we've deemed notable and non-frivolous. That's what we're trying to do."
I used to resent Wikipedia-Watch referring to the editors, arbiters, and overseers as a "hive mind," but the recent policy changes have increased the likelihood of a hivemind emerging.
There's also a new system where there are a few overseers-to-the-overseers who can make an article or particular edit be deleted without showing up in the history log or deletion log; it's supposed to be reserved for the removal of private information, specifically in instances of the "right-to-disappear" and "right-to-anonymity" systems which allow an editor to protect their identity, if they so desire, when necessary. This can lead to strange situations where one user can make another user appear to be a vandal by careful manipulation of a page and posting of private information through multiple accounts; then, looking at the edit history diffs for a page can make vandalism appear to be caused by someone that it isn't, since certain edits are completely hidden. There's no way for anyone but the overseers-to-the-overseers to be able to tell if these kinds of nearly-invisible changes have even occurred on a certain article (or, at least, the pages outlining this policy seem to indicate this; whether some pages and edit histories might have a "Notice: some revisions are hidden to protect certain individuals' privacy, and some diffs may be inaccurate" notice somewhere on them is not documented, although I would hope that sort of notice will be added if it doesn't already exist.)
I also wish that deleted articles could be viewed by the general public if they so choose. I understand that sometimes deletion is used in cases of illegal content, but, what about the perfectly-legal but uncited or non-notable deletions?
There are also two database admins who have the power to do anything at all without leaving an audit trail (Jimmy Wales and one of the lead Wiki code developers, iirc), which is a little scary. It seems to go against the ideas that WP is supposed to stand for (open editing model, visible history, etc). I only hope that its usage is severely limited and that some am
Uhh, you just proved the GP's point. He didn't say HOW consensus was reached, just that it was ruled by consensus.
The whole 'notability' requirement was one that really irritated me. I would have thought 'usefulness' was a much better standard for an encyclopaedia with no real size constraints. If a page is getting hits from people reading it, then it should count as sufficiently notable to remain, and not be deleted because it doesn't meet someone's standards for important. It's not like it's wasting shelf space...
I am TheRaven on Soylent News
One thing I wish Wikipedia would do is cache the citations, at least when the citations are to a website. I've noticed that in a slightly out-of-date wiki entry, a good majority of the citations usually lead to pages that no longer exist. I'm sure there are legal and technical issues that make it difficult; however, transparency of works cited is crucial.
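For what it's worth, the technical half of citation caching is not much code. Below is a bare-bones Python sketch under some loud assumptions: CitationCache and fetch_url are names invented here, snapshots live only in memory, and the legal questions are ignored completely. It keeps a copy of the cited page when the citation is added and falls back to that copy once the live page is gone.

```python
import time
import urllib.request
from typing import Callable, Dict, Tuple


def fetch_url(url: str) -> bytes:
    """Default fetcher: download the page body over the network."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


class CitationCache:
    """Hypothetical in-memory store of cited pages, keyed by URL."""

    def __init__(self, fetch: Callable[[str], bytes] = fetch_url):
        self._fetch = fetch
        self._snapshots: Dict[str, Tuple[float, bytes]] = {}

    def snapshot(self, url: str) -> None:
        """Store a timestamped copy of the page when the citation is added."""
        self._snapshots[url] = (time.time(), self._fetch(url))

    def resolve(self, url: str) -> bytes:
        """Try the live page first; fall back to the stored copy if it is gone."""
        try:
            return self._fetch(url)
        except OSError:  # urllib's URLError and HTTPError are OSError subclasses
            if url in self._snapshots:
                return self._snapshots[url][1]
            raise
```

The fetcher is passed in as a parameter so the fallback can be tried without touching the network, e.g. with a fake fetcher that raises OSError after the page "disappears".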
WikiScanner does NOT allow people to tell the "motivations" of those who make changes... it simply identifies those parties (in some cases), and other people draw their own conclusions. Those are not the same things.
Further, WikiScanner is probably going to work itself out of a job, because now savvy people will not use Corporate sources for making their self-serving changes. Of course, WikiScanner will still continue to uncover the clueless... but if anybody in business is smart at all, its popularity is already making it less useful.
It's not like it's especially hard to drill down to real sources from most WP articles.
The hell it isn't. For the average stuff the average school project is based on, it would be nearly impossible to find the original sources.
School libraries are small, and most of them aren't even interconnected. And even the public library system, which is interconnected, is slow. I recall once trying to find the sources listed in a Britannica article in school so that I didn't have to cite Britannica (note it wasn't that I didn't want to cite Britannica, but rather that I was required to cite 10 different sources as a requirement of the assignment).
I couldn't find a single one, anywhere. Zilch in the school library. And the public library didn't fare much better. Only one book was in the province, and it would have taken weeks to get through the inter-library system. The only place I could find the papers was if I wanted to pay.
Drilling down to the cited sources ought to be its own assignment. It's harder than writing the paper.
I actually know someone who is a lecturer, and when he sets papers, he goes on the relevant Wikipedia entries and inserts misinformation. Then, when this nonsense crops up in papers, he marks them down in a "haha pwnt" sort of way. He says it's a way of teaching people not to rely on such sources.
Of course, this raises questions of ethics. He's sabotaging a source of information in order to "teach a lesson" to his students. Wouldn't his time be better spent improving said source of information? Isn't that his job, after all?
How dare you be so modest!! You conceited bastard!!
I'm not really sure what you're getting at.
Finding primary sources in print is hard and extremely time-consuming, and requires access to a big library. Completely agree. It's totally beyond the scope of most students in public primary and secondary schools, and probably most college students who aren't at a big university.
However, this is where Wikipedia is better than Britannica. In a Britannica article, you usually get a few print sources as references. In a Wikipedia article, you usually get a ton of references, and many of them are electronic (and if it's a recent event, many of them are both electronic and primary sources, e.g. links to news sources).
Take, for example, the WP article on George Washington. It has 49 direct citations, most of which are to sources that are available both freely and electronically. And many of those are to well-respected institutions that you could cite directly (the LOC, the Smithsonian, etc.). And beyond that, there's a separate list of suggested reading, which includes electronic versions of George's actual writings, a short biography published in the NY Times, and a collection of primary-source material related to slavery in Philadelphia by the Independence Hall Association. In five minutes, starting with the WP article, I turned up more primary and citable secondary sources than I probably could have found in an afternoon's worth of searching in a good library.
That, to me, is the real strength of Wikipedia. Regardless of its strengths or weaknesses as a source itself, it is an excellent portal to a vast quantity of electronic information, available to anyone with an Internet connection. While a student forced to use nothing but paper sources is hobbled by the size of their school's library -- which is almost always directly proportional to the wealth of the area they live in and the importance that the community places on education -- allowing students to use (good) electronic sources narrows the gap considerably, provided they have access to the Internet to begin with.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."