Wikipedia's Accuracy Compared to Britannica

Posted by ryuzaki0 on Thursday December 15, 2005 @02:36AM from the elementary-my-dear-data dept.

Raul654 writes "Nature magazine recently conducted a head-to-head competition between Wikipedia and Britannica, having experts compare 42 science-related articles. The result was that Wikipedia had about 4 errors per article, while Britannica had about 3. However, a pair of endeavouring Wikipedians dug a little deeper and discovered that the Wikipedia articles in the sample were, on average, 2.6 times longer than Britannica's - meaning Wikipedia has an error rate far less than Britannica's." Interesting, considering some past claims. Story available on the BBC as well.
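
The length-normalization claim in the summary can be checked with a few lines of arithmetic. This is only a sketch using the rounded figures quoted above (about 4 and 3 errors per article, and a 2.6x length ratio); the function and constant names are illustrative, and the result says nothing about whether errors per unit of text is the right metric.

```python
# Figures as rounded in the story summary; these are not Nature's raw data.
WIKIPEDIA_ERRORS_PER_ARTICLE = 4.0
BRITANNICA_ERRORS_PER_ARTICLE = 3.0
LENGTH_RATIO = 2.6  # average Wikipedia article length / Britannica article length


def relative_error_density(wiki_errors: float, brit_errors: float, length_ratio: float) -> float:
    """Wikipedia's errors per unit of text, as a fraction of Britannica's."""
    return (wiki_errors / length_ratio) / brit_errors


ratio = relative_error_density(
    WIKIPEDIA_ERRORS_PER_ARTICLE, BRITANNICA_ERRORS_PER_ARTICLE, LENGTH_RATIO
)
print(f"{ratio:.2f}")  # prints 0.51
```

On these numbers, Wikipedia's errors per unit of text come to roughly half of Britannica's, which is the basis for the "far less" claim.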

Careful with stats... by erick99 · 2005-12-15 02:42 · Score: 5, Insightful

I am not sure that it is reasonable to consider error rate primarily as errors per unit of text. In that case, one could write a submission and then insert a lot of fluff to lower the "error rate." I would consider the absolute amount of errors per submission at least as important as the quantity of errors as a function of quantity of text. Just a thought.

--
http://www.busyweather.com/
Versatility by soulsteal · 2005-12-15 02:42 · Score: 5, Insightful

Sure they found errors in Wikipedia and Britannica, but which one can you go back to and correct?

Game, set, match!
Informative by drewzhrodague · 2005-12-15 02:43 · Score: 4, Insightful

I find Wikipedia quite informative, and easy to get to. I don't see what the problem is, or why those people want to class-action Wikipedia. I've learned a bunch of things by browsing, and investigating things mentioned in the articles. Even if Wikipedia were a little bit inaccurate, it would certainly beat out my first 8 years of education, where I've found almost all of the science I've learned is actually wrong (by talking to scientists, and reading books, and Wikipedia).

--
Zhrodague.net - I do projects and stuff too.
Longer article... by everphilski · 2005-12-15 02:47 · Score: 4, Insightful

... doesn't mean a better article. Encyclopedias are meant to be concise and to the point. A starting point for research, not a be-all and end-all. And I don't agree with normalizing errors to the length of the article; it should be the number of errors per article. Just because you wrote more stuff, it doesn't give you the leeway to screw up more...
Re:More words == lower error rate? by Phreakiture · 2005-12-15 02:50 · Score: 4, Insightful

So if I go to Wikipedia and type the word "gibblefinch" a few thousand times into an article, I can reduce its error rate?

Only if that is what the article should say, and saying so is useful to someone looking up whatever topic it is you are looking up and finding the aforementioned gibblefinch storm. If, on the other hand, it is not useful or relevant, then no; it would tend to increase the error rate, or at least lower the signal-to-noise ratio, rather greatly.

--
www.wavefront-av.com
Re:Not exactly by irote · 2005-12-15 02:56 · Score: 4, Insightful

And it's also nonsense. The Wikipedia article is written flabbily, by a collection of authors, some experts, some not, some good writers, some terrible ones.

The Britannica, on the other hand, is written by someone with clear credentials as an expert, to a word limit, and is then edited for conciseness and clarity. That is to say, the Britannica piece will undoubtedly say more than the Wikipedia piece. The error per word rate in Britannica may be higher, but the error per fact rate is probably much more favourable to Britannica.

Easy example - compare the writing in a mainstream newspaper to a well-written one with tight editorial policies, like the Financial Times or the Economist. Your average Sydney Morning Herald, Guardian or San Francisco Chronicle article is probably longer, but it says less.
Can't reference Wikipedia because it changes by nincehelser · 2005-12-15 02:59 · Score: 5, Insightful

Wikipedia seems fine for informal use, but how can you possibly cite sources with something that is constantly changing?
How are they quantifying "error"? by kalidasa · 2005-12-15 03:08 · Score: 5, Insightful

If the Britannica article misspells 2 words, and the Wikipedia article is based upon an assumption that light travels through the medium of ether, does that mean that Wikipedia has half as many errors as Britannica? This is a lot more complicated than the kind of statistical error analysis these folks are trying for.
Re:Another thing by laughingcoyote · 2005-12-15 03:11 · Score: 4, Insightful

That's all just made up shit, dude. Why would you want that in an encyclopedia??

While I don't have a set of Britannicas right here, I would guess that you can find references in Britannica to the plays of Shakespeare, Aphrodite, Zeus, Thor, and The Odyssey.

All of that is "made up shit", but a culture's fiction and mythology is still relevant to a discussion of the culture in question. So why shouldn't Wikipedia, with its quicker-changing nature, have information on more modern fiction and myth?

--
To fight the war on terror, stop being afraid.
Can't we all just get along? by typical · 2005-12-15 03:33 · Score: 5, Insightful

Other than as a willy-waving metric, it seems that the error count in a tiny sampling of articles isn't useful at *all*.

I mean, it's pretty clear that both Britannica and Wikipedia are useful references. They have different strengths and weaknesses, but neither is going to be unilaterally better.

Now, I personally use WP exclusively; it's available from anywhere with a web browser, it's free, it covers the sorts of things that I deal with frequently (tech, pop culture, people) and I'm a fan of the open source mentality. For my particular needs, WP is better suited. However, I don't see a need to claim that one is *better*. There are going to be WP articles that are *chock full* of errors on some points or link to sketchy sources, and there are going to be Britannica articles that just don't exist compared to WP or are simply outdated. It doesn't take people very long to figure out which is more appropriate to their uses, because aside from the initially surprising fact (to me, at least) that WP works and doesn't simply fall prey to vandalism, the strengths of the two aren't that hard to figure out. I'm not going to use WP as a primary source for a research paper, but it's going to be the very first reference that I turn to when I want an overview of a topic.

I think that WP still has some challenges to pass -- WP contains articles on specific *products*, which Britannica completely lacks, and at some point, marketers are going to start expressing interest in the ability to freely edit Wikipedia articles on their products. But people that claim that WP is not useful are so clearly demonstrated wrong by a short while of using WP that there isn't any point in even arguing the point. It would be like someone claiming that Google isn't useful because it can return results to pages that aren't peer-reviewed.

Right now, there's a lot of noise over the Seigenthaler incident, but that's a tiny ripple in a vast ocean -- people will find a way to solve problems like this (if not in WP, then in a competing, derived system), just because it's so useful to do so. Reputation systems, a second system that blocks admission of changes until someone reviews them, whatever. We haven't even scratched the surface of systems like this, and their value is clearly phenomenal. I have read far more history and computer science on WP than I've been motivated to read about elsewhere for quite some time. I've looked up a number of things that I always wondered about (what "grunge" actually *is*, for example), because WP is so quick to access, so vast, and so readable.

The best thing about all this is that WP is something that nobody (or very few people, at least) was making noise about until recently. The Internet solves problems (communication, latency, ability to provide links to other content, ease of collaboration, access to everyone to try out new system ideas) that allow incredible new systems that have never existed before in humanity's existence, and the number of new (as of yet raw perhaps, unpolished) systems is *exploding*. Search engines are the only thing that was an immediate and obvious application to me when the Web came into being, and even the mechanisms of something like Google were certainly not obvious. In the past few years, we have seen ideas like del.icio.us, Yahoo's bundle of services, free webmail, Wikipedia, and so forth come into being. What's even more incredible is that these things are *enabling* technologies. Each one is a tool that allows people to more easily communicate or deal with things, which makes us even *more* powerful and makes it even easier for us to make new tools. If I can freely collaborate without long-distance phone charges with people in Sweden, I expand the number of people that I can share knowledge with. If I can read, at least in a rudimentary fashion, languages that I otherwise couldn't through use of Babelfish, I have hugely increased the number of documents available to me. If I can take advantage

--
Any program relying on (nontrivial) preemptive multithreading will be buggy.
Re:Not exactly by irote · 2005-12-15 03:37 · Score: 5, Insightful

What's the content unit? The fact or the word?

As you say, the quality of writing is not what's being examined. We turn to an encyclopedia, whether printed or online, for facts.

For this reason, it's the accuracy of these facts that is of interest to us.

Accept the (indubitably true) proposition that the fact-to-word ratio in Britannica is higher than in Wikipedia, and the submitter's 'argument' fails: dividing the number of errors in an article by its length in words does not give you a meaningful average error rate.

A word is neither true nor false, a statement can be.
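
To make the fact-versus-word distinction concrete, here is a minimal sketch of an errors-per-fact calculation. Every number in it is a hypothetical placeholder: the word counts and the facts-per-word densities are invented for illustration and come from neither the Nature study nor this thread, and nobody here has actually measured a fact density for either encyclopedia.

```python
# All inputs below are hypothetical placeholders chosen only to illustrate
# the argument; none of them come from the Nature study.


def errors_per_fact(errors: float, words: float, facts_per_word: float) -> float:
    """Errors divided by the estimated number of factual statements."""
    return errors / (words * facts_per_word)


def break_even_density(errors_a: float, words_a: float, rate_b: float) -> float:
    """Facts-per-word that source A needs to match B's errors-per-fact rate."""
    return errors_a / (words_a * rate_b)


# Assumed: a 1000-word Britannica article with 3 errors and a dense style,
# and a Wikipedia article 2.6 times as long with 4 errors and a looser style.
britannica = errors_per_fact(errors=3, words=1000, facts_per_word=0.10)
wikipedia = errors_per_fact(errors=4, words=2600, facts_per_word=0.04)
needed = break_even_density(errors_a=4, words_a=2600, rate_b=britannica)

print(f"Britannica errors per fact: {britannica:.3f}")
print(f"Wikipedia errors per fact:  {wikipedia:.3f}")
print(f"Wikipedia break-even facts per word: {needed:.3f}")
```

Under these invented densities Britannica comes out ahead per fact even though it trails per word. The ranking depends entirely on the assumed fact density, which is the quantity that would have to be measured before either side could claim a win.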
Participation by shaitand · 2005-12-15 03:47 · Score: 5, Insightful

Did the experts correct the errors? I hope so.
Re:Entries the same length...or not? by geoffspear · 2005-12-15 04:08 · Score: 4, Insightful

Yes, but you're assuming that the rate of errors per article remains constant when the lengths of the articles vary.
Even if you ignore the obvious bias of the people (identified as "Wikipedians") refuting the Nature study, you have to admit their methodology is flawed. If the original study properly controlled for the length of articles, you can't refute it by showing that articles they didn't study might vary in length.

--
Don't blame me; I'm never given mod points.
Re:Not exactly by iabervon · 2005-12-15 04:25 · Score: 4, Insightful

That is to say, the Britannica piece will undoubtedly say more than the Wikipedia piece.

That's not actually true. Wikipedia's threshold for relevance is lower, so the articles say more, in addition to being less densely written. This is, to a large extent, because Britannica has to print theirs, so they have pressure to keep things brief, whereas Wikipedia can go into lots of detail. I don't have access to Britannica, but I'm willing to bet that it doesn't explain the Reed-Solomon configuration for error correction on CDs. So chances are that Wikipedia articles have more information in them, although not by as big a factor as the increase in size. Of course, there's no way for us to know at this point the characteristics of the articles that Nature used for this comparison, because they seem to have merged related articles in both cases. For example, most of the content of the Wikipedia "Field Effect Transistor" is in the articles on particular types (MOSFET, JFET, etc.), and the article on Woodward in Britannica must have gotten sections from other articles (e.g., overviews of things he worked on) pulled in if Nature compared versions of remotely similar lengths or scope, since Britannica doesn't break up this topic into articles the same way.
Why just science articles? by ThinkFr33ly · 2005-12-15 04:25 · Score: 4, Insightful

Seems to me that science articles might not be the right category of articles to use to judge the accuracy of Wikipedia. I suspect that most people contributing to the science articles have a pretty good knowledge of the subjects in question... they're not things that most people know a lot about. Acheulean industry? Kinetic isotope effect? Meliaceae? Huh?

Where I suspect more errors abound in Wikipedia is in the articles about things that a lot of people think they know a lot about, but in fact don't have any idea what they're talking about. Or topics in which people have a vested interest in misinforming people. (Political topics, for example.)

Honestly, a better comparison would have been a sampling of 100 or so randomly selected entries. Confining it to just science articles seems like an attempt to misrepresent the accuracy of Wikipedia.
Re:Dooop by ceejayoz · 2005-12-15 04:29 · Score: 4, Insightful

Here this was up just yesterday and was just taken down.

So you left slander up on the Internet when you could easily have removed it? You're part of the problem!

Without Wiki it WOULD NOT HAVE BEEN UP AT ALL.

And neither would much of the useful content.

Other Encyclopedias don't have problems, anywhere even remotely close to Wiki with its slander and information authentication WARS.

Other encyclopedias don't have much of the more obscure information available in Wikipedia.
Re:Not exactly by Haeleth · 2005-12-15 05:26 · Score: 5, Insightful
Accept the (indubitably true) proposition

Your use of language is as careless as that you attribute to Wikipedia's editors. No proposition is "indubitably true", and no proposition can be proven by asserting its truth without providing any sort of argument to support the assertion.

It is plausible that Britannica presents facts more concisely. It is even likely. But unless someone actually
- Defines a "fact", in the context of an encyclopedia article, in an objective and measurable way;
- Devises a methodology for assessing the ratio of facts (thus defined) to words;
- Applies this methodology to a statistically significant selection of articles from Wikipedia;
- Applies the same methodology to a comparable set of articles from Britannica; and
- Publishes their definitions, methodology, and results,
then you simply cannot describe the proposition as "true". And even if such a study existed, you would have to be pretty damn sure that its methodology was unassailable before you could consider describing the proposition it supported as "indubitably true".