Wikipedia and Plagiarism

But by JesseBikman · 2006-11-05 03:30 · Score: 1

wikipedia is free.

Re:But by The+MAZZTer · 2006-11-05 03:34 · Score: 1

So are my term papers.
Re:But by JesseBikman · 2006-11-05 04:05 · Score: 1

but if shit is knowledge, wouldn't it be good to know your shit?
Re:But by albertost · 2006-11-05 04:37 · Score: 1

not shit, but cancer, Mr Ballmer
Re:But by Dunbal · 2006-11-05 05:19 · Score: 1

but if shit is knowledge,

Then tubgirl is a smart woman...

--
Seven puppies were harmed during the making of this post.
Re:But by Tim+C · 2006-11-05 05:24 · Score: 1

But copyright infringement isn't; just being non-commercial won't necessarily save it, if infringement is indeed taking place.

--
It's official. Most of you are morons.
Re:But by LindseyJ · 2006-11-05 05:26 · Score: 1

No they aren't. You're paying good money to write those term papers.
Re:But by tepples · 2006-11-05 05:31 · Score: 1

So are my term papers

This appears to be a joke, but what are you trying to imply? Is it that you reused Wikipedia's expression and made your own papers Free to comply with the Wikipedia copyleft?
Re:But by JesseBikman · 2006-11-05 12:34 · Score: 1

Actually, that's orange juice... or so the Wikipedia article for tubgirl states.

That doesn't seem like alot by NinjaFarmer · 2006-11-05 03:33 · Score: 2, Insightful

Doesn't Wikipedia have over a million articles (not in English alone, I know)? That would mean that less than 0.1% of the articles are plagiarized. It seems reasonable to me that that amount would get by unnoticed. All it takes is for the original author then to deal with it.

Re:That doesn't seem like alot by sprins · 2006-11-05 03:36 · Score: 2, Insightful

Apparently Wikipedia has over 1.5 million English articles alone. So your calculation of the percentage of 'problematic' articles is even more favourable. Of those 142 allegedly 'problematic' articles only a few really seem to be a problem, as the others originated from the public domain to begin with.

Sounds like much ado about nothing once more. *yawn*
Re:That doesn't seem like alot by nomadic · 2006-11-05 03:40 · Score: 1

Except the story specifically says he checked only about 12,000 of wikipedia's articles, so that would make it about 1% are plagiarized if you extrapolated. Still not horrible, but I'm guessing it's a lot higher than Britannica.
Re:That doesn't seem like alot by aquaepulse · 2006-11-05 03:42 · Score: 4, Insightful

Well, that 142 was found out of his search of 12,000. If his methodology was sound, you could expect the number plagiarized within the 1.5 million to be about 17,750, or about 1.18%.
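
For anyone who wants to check that extrapolation, here is a minimal sketch of the arithmetic. It uses only the figures quoted in this thread (142 flagged out of 12,000 checked, roughly 1.5 million English articles); the variable names are made up for illustration, and the result only means anything if the 12,000 were a representative sample.

```python
# Figures quoted in the thread; variable names are illustrative only.
flagged = 142               # articles flagged in the sample
sample_size = 12_000        # articles checked
total_articles = 1_500_000  # approximate English article count cited above

# Share of the sample that was flagged.
rate = flagged / sample_size

# Scale the sample count up to the whole encyclopedia (integer arithmetic).
extrapolated = flagged * total_articles // sample_size

print(f"sample rate: {rate:.2%}")             # sample rate: 1.18%
print(f"extrapolated count: {extrapolated}")  # extrapolated count: 17750
```

That count is a ceiling on what this method could find, since it scales up every flagged article whether or not it turns out to be real plagiarism.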
Re:That doesn't seem like alot by Yvanhoe · 2006-11-05 03:52 · Score: 1

Right now we can just watch and see how this story ends. I doubt this automated procedure could take into account contributors copying their own copyrighted material into Wikipedia. I think this has already happened. I don't say that 100% of the 142 articles are in this case, but I think he raises an interesting point. Let's now see how this ends.

--
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
Re:That doesn't seem like alot by tomhudson · 2006-11-05 04:00 · Score: 2, Informative

... and after an investigation of some of those by Wikipedia, it was found that some were in the public domain, some were culled from government sites, and some were copied from the wiki, and not the other way around. Of those 12,000, we can now say that the wiki is at least as clean as Ivory soap (99.44%).
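
As a rough check on the Ivory soap comparison, the sketch below works out how many of the 142 flagged articles would have to be cleared for the sample to reach 99.44% clean. The thread does not say how many were actually cleared, so this only computes the threshold; the variable names are illustrative.

```python
# Figures quoted in the thread; names are illustrative only.
flagged = 142
sample_size = 12_000
target_clean = 0.9944  # the "Ivory soap" purity figure

# Cleanliness if every flagged article were a genuine problem.
raw_clean = 1 - flagged / sample_size

# Largest number of genuinely problematic articles that still meets the target.
max_remaining = int((1 - target_clean) * sample_size)

# Smallest number of flagged articles that must turn out to be fine.
min_cleared = flagged - max_remaining

print(f"clean if all 142 are real: {raw_clean:.2%}")    # 98.82%
print(f"at most {max_remaining} may remain flagged")    # 67
print(f"so at least {min_cleared} must be cleared")     # 75
```

So the 99.44% figure holds only if a bit more than half of the flagged articles were false positives.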
Re:That doesn't seem like alot by sbaker · 2006-11-05 04:13 · Score: 1

Some are also instances of people writing something on their own web site and then later deciding to put it on Wikipedia - so even the instances where the other web site predates the Wiki article may not be copyright violations. Without discussing the matter with every single original author, it's hard to know.

I guess the only thing this study tells us is that an UPPER limit on the number of plagiarisms is of the order of 1%. That's still an alarmingly high number.

--
www.sjbaker.org
Re:That doesn't seem like alot by tomhudson · 2006-11-05 04:29 · Score: 2, Insightful

Considering that an audit of dead-tree encyclopedias hasn't been done, we can't say. What we CAN say is that it's foolish to make a comparison with Britannica, when an audit of Britannica found 10% of 600 articles to be non-factual. The sources cited in those 10% disavowed the articles' contents.
This isn't all that surprising either, when you think about it. People cite people who cite people, and someone somewhere will mis-interpret what someone else wrote, or come to different conclusions while still citing the original author.
Re:That doesn't seem like alot by kkwst2 · 2006-11-05 04:39 · Score: 2, Interesting

Alarmingly high? You find it alarming that 1 of every 100 articles on a free web-based encyclopedia has plagiarized material? You are clearly much less cynical than I am. I would have guessed at least 5%, probably more.
Re:That doesn't seem like alot by DragonWriter · 2006-11-05 04:49 · Score: 1

Except the story specifically says he checked only about 12,000 of wikipedia's articles, so that would make it about 1% are plagiarized if you extrapolated.

Which would make sense to do if it was a systematic random sample, rather than a selection conducted by someone who has been on an anti-Wikipedia crusade for quite some time, as this one is. Of course, there is the question of the trustworthiness of the original number, as well, as the material was never independently reviewed, and Wikipedia's own reviews (as TFA notes) found some cases that Brandt did not eliminate where the other site appears to have copied Wikipedia rather than the other way around.
Re:That doesn't seem like alot by user24 · 2006-11-05 05:26 · Score: 5, Funny

"It's a wiki. If you find a problem with it, you fix it."
no, it's a wiki. If you find a problem with it, you add a template telling everyone that someone else should fix it.
Re:That doesn't seem like alot by sbaker · 2006-11-05 05:31 · Score: 1

If it were (as many people assume) just a case of people sitting down and writing articles which remain in that state for all time - then, yes, I'd guess closer to 5% too. But that's not how Wikipedia works. Pick an article at random - hit the history button - see how many people have worked on it? For plagiarism to stand, it requires that none of the subsequent editors noticed it. That's much less likely - but still possible - but in addition to that, the general churning up of text tends to change sentences and paragraphs around until they bear no resemblance to the form they came in as...this would 'unplagiarize' text fairly quickly in most cases.

So - yeah - I'm a little surprised it's as high as 1% - and probably it's not.

--
www.sjbaker.org
Re:That doesn't seem like alot by TheCarp · 2006-11-05 05:43 · Score: 1

wow that made me laugh so hard, I am practically crying.

--
"I opened my eyes, and everything went dark again"
Re:That doesn't seem like alot by CuriosityKilledWHAT · 2006-11-05 06:46 · Score: 1

It's really not a lot. Compare to:
Plagiarism by Adult Learners Online: A case study in detection and remediation
Or look here.
Re:That doesn't seem like alot by Monsuco · 2006-11-05 06:52 · Score: 1

All it takes is for the original author then to deal with it.
And come to think of it, couldn't this guy have just fixed it rather than whined about it? It is a wiki after all.

--
The Gospel according to lolcat
Re:That doesn't seem like alot by DJCacophony · 2006-11-05 07:16 · Score: 1

You don't need to know the subject well enough to write authoritatively on it. You just need sources that do.

--
Slow Down, Cowboy! It's been 60 minutes since you last successfully posted a comment.
Re:That doesn't seem like alot by Magic5Ball · 2006-11-05 07:57 · Score: 1

That would be 1% plagiarized from things Google indexes. There are many paper books and journal articles, and other on-line sources which aren't in Google but which can be just as easily copied.

--
There are 1.1... kinds of people.
Re:That doesn't seem like alot by mj_sklar · 2006-11-05 08:35 · Score: 1

Well, sir, if you knew the subject so well, would it hurt you to share your knowledge? You wouldn't need the wiki, but you could be using the wiki to inform others.

--
The wii is the revolution, comrade! ...use the fucking wiimote or I'll gut you like a fish!!!
Re:That doesn't seem like alot by cheater512 · 2006-11-05 08:39 · Score: 1

Why was it only out of 12,000? The raw database dumps are available.
Wouldn't it be easier to do the lot and save having to do it again?
Re:That doesn't seem like alot by GreatBunzinni · 2006-11-05 08:52 · Score: 1

Actually the Wikipedia procedure for weeding out the copyrighted work is to flag the article as a possible copyright violation (add a {{copyvio}} template to the article) along with the source and then inform the editors about that problem by adding the article into the list of articles with possible copyright violations.

Regarding the article, there is already a very active community weeding out Wikipedia of possible copyright violations. I don't know how this can be considered news.

--
Slashdot, fix your code or at least hire someone who is competent at it to do it for you.
Re:That doesn't seem like alot by damiangerous · 2006-11-05 08:58 · Score: 1

That misses the point of the AC parent. I could be updating it if I knew the subject, yes. But what if I didn't know the subject? How would I know it's wrong? I either have to be researching the subject anyway, in which case I wouldn't be using information from Wikipedia or I would have to just have a passing interest in the subject without being particularly concerned if it's wrong.
Tycho from Penny Arcade said it best, and this is a point that has never been addressed.
Re:That doesn't seem like alot by kalidasa · 2006-11-05 09:51 · Score: 1

Something copied from public domain sites or government sites without attribution is still plagiarised. Plagiarism and copyright infringement are entirely distinct phenomena; just because something is in the public domain does not mean that it cannot be plagiarised.
Re:That doesn't seem like alot by Achromatic1978 · 2006-11-05 10:57 · Score: 1

What we CAN say is that it's foolish to make a comparison with Britannica, when an audit of Britannica found 10% of 600 articles to be non-factual. The sources cited in those 10% disavowed the articles' contents.

Speaking of a citation, I'd love to see a citation for a claim that 10% of articles in what is commonly regarded as the world's most authoritative encyclopedia are "non-factual".
Re:That doesn't seem like alot by Achromatic1978 · 2006-11-05 11:06 · Score: 1

Actually, they're not, really. They've been broken for a while now: http://download.wikimedia.org/enwiki/20061014/ That being said, last I read the databases were approaching, if not exceeding a terabyte in size. Would be a far more complex project to test for plagiarism against a terabyte of corpus.
Re:That doesn't seem like alot by user24 · 2006-11-05 11:14 · Score: 1

I'm glad it was interpreted as 'funny' not 'troll'.
you may be interested in my wiki ideas at [[User:user24]]
Re:That doesn't seem like alot by doom · 2006-11-05 11:51 · Score: 1

Pick an article at random - hit the history button - see how many people have worked on it? For plagiarism to stand, it requires that none of the subsequent editors noticed it.

More to the point, for a plagiarized sentence to stand, it would have to avoid getting re-written to insert a bit of Simpsons trivia in the middle of it.
The odds of that happening are very low.
Re:That doesn't seem like alot by tomhudson · 2006-11-05 12:02 · Score: 1

It's hard to give attribution for stuff in the public domain - a lot of times there IS no known original author. But the most prolific author ever in human history is some guy named Anonymous.
Re:That doesn't seem like alot by dbrutus · 2006-11-05 12:05 · Score: 1

If something may be legally copied, why is copying it a bad thing? One would think expending the resources to needlessly reinvent the wheel would be considered the bad thing, not copying public-domain IP.
Re:That doesn't seem like alot by tomhudson · 2006-11-05 12:09 · Score: 1

And I'd like to see a citation that Britannica is "the world's most authoritative encyclopedia". For the article I'm talking about, you'd have to go into the back stacks of the Montreal Gazette, circa the late '70s.
Now if you want to have some fun, go and read one of those 1970's editions of Britannica. They contain a multitude of howlers that even a high school student knows aren't true. Facts aren't supposed to be mutable.
Re:That doesn't seem like alot by tomhudson · 2006-11-05 12:13 · Score: 1

BTW - one study showed that Britannica averages 3 mistakes per article, and Wikipedia 4.
http://media.www.thejusticeonline.com/media/storage/paper573/news/2006/10/31/Columnists/Kate-Millerick.Can.Wikipedia.Really.Be.All.That.Inaccurate-2410891.shtml?sourcedomain=www.thejusticeonline.com&MIIHost=media.collegepublisher.com
Surprisingly, compared to other online encyclopedias, Wikipedia more than holds its own. The well-known and respected Encyclopedia Britannica averages three mistakes per topic in comparison to Wikipedia's four mistakes on the same topics, according to The Chronicle.

If you want to say the Wiki is crap, so is the Britannica. But at least the Wikis are updated (sometimes within the hour of someone planting a false story) as you'll see from the links.
Re:That doesn't seem like alot by sbaker · 2006-11-05 12:30 · Score: 1

I'd love to see a citation for a claim that 10% of articles in what is commonly regarded as the world's most authoritative encyclopedia are "non-factual".

Your wish is my command!

I guess "non-factual" seems a bit strong - I would take that to mean that the entire article had to be totally bogus. I don't think anyone is claiming that. But for a weaker definition of "non-factual": "An article containing at least one major non-fact", the figures are indeed about 10% for both Britannica AND Wikipedia.

The 'Nature' survey is the one that's generally quoted. They took 42 science-related articles from both Wikipedia and Britannica and asked a bunch of highly regarded experts in the field to carefully fact-check each article. They found FOUR significant factual errors within those 42 articles in each encyclopedia. That's a hair under 10% for those of us who are counting...but close enough. In addition to those four 'significant' errors, there were also 162 minor errors in Wikipedia and 123 minor errors in Britannica.
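
A quick sketch of the arithmetic behind those figures, using only the numbers given in the paragraph above (42 articles per encyclopedia, 4 significant errors each, 162 and 123 minor errors). The variable names are made up for illustration.

```python
# Counts as stated in the paragraph above; names are illustrative only.
articles = 42
significant_errors = 4          # per encyclopedia
minor_errors_wikipedia = 162
minor_errors_britannica = 123

# Significant errors per 100 articles (the "hair under 10%").
significant_rate = significant_errors / articles * 100

# Average minor errors per article for each encyclopedia.
wikipedia_per_article = minor_errors_wikipedia / articles
britannica_per_article = minor_errors_britannica / articles

print(f"significant errors: {significant_rate:.1f} per 100 articles")  # 9.5
print(f"Wikipedia minor errors per article:  {wikipedia_per_article:.2f}")   # 3.86
print(f"Britannica minor errors per article: {britannica_per_article:.2f}")  # 2.93
```

Rounded to whole numbers, those per-article averages match the "four mistakes versus three" figure quoted elsewhere in this thread.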

I don't have a link to the Nature article - but here is a story about the story:

http://www.nature.com/news/2005/051212/full/438900a.html

I understand that subsequent surveys have shown that the minor error rate in Wikipedia has improved significantly (as you might expect for an encyclopedia that's updated so often) - and that by some measures at least, Wikipedia is now more reliable than Britannica. With Wikipedia's error rate fast improving over Britannica - and with breadth of coverage VASTLY outstripping Britannica - we have to stop calling Britannica "the world's most authoritative" and hand that title to the Wikipedia.

--
www.sjbaker.org
Re:That doesn't seem like alot by NichG · 2006-11-05 12:38 · Score: 1

Copying it isn't itself the problem; the problem is copying without attribution. If something is copied without attribution then the work may be credited to the wrong person. It also means that if there were modifications to the work the original cannot be traced for comparison, and if the particular thing copied is only an excerpt it isn't possible to find the context from which that was excerpted without the proper attribution.
Re:That doesn't seem like alot by Paradise+Pete · 2006-11-05 12:40 · Score: 1

there is already a very active community weeding out Wikipedia of possible copyright violations. I don't know how this can be considered news.
Plagiarism has nothing to do with copyright violation. If I falsely claim something as my own writing it is plagiarism. The copyright status of the source material is irrelevant.
Re:That doesn't seem like alot by rtb61 · 2006-11-05 17:48 · Score: 1

How can the work be credited to the wrong person, when none of the work is credited to anybody at all? Wikipedia is about the community and the content, not personal bragging rights on what piece of content some individual tries to claim as their own sole original thought throughout perpetuity, billions of years into the past and billions of years into the future.
Wikipedia does not claim to be the best encyclopedia; this is what it claims: http://en.wikipedia.org/wiki/Wikipedia:What_Wikipedia_is_not As for all that pointless referencing, bibliographies etc.: http://en.wikipedia.org/wiki/Anal_retentive It is about content, community and sharing of knowledge. Universities etc. are quite welcome to ignore Wikipedia and critique it; they can also create their own version (we know it won't happen because they will all nitpick it to death before it can get off the ground; much like the gulls in 'Finding Nemo', the content will be the last thing they care about).
For the majority it is just about the content; the rest have every university, every library, every other commercial encyclopedia web site. You can't own and control everything.

--
Chaos - everything, everywhere, everywhen
Re:That doesn't seem like alot by DuranDuran · 2006-11-05 18:09 · Score: 1

> If you find a problem with it, you fix it.

Nah, I don't. I used to edit Wikipedia, with the lofty ideals of "giving something back". Rarely major substantive edits, mind: mostly corrections to sentence structure and punctuation.

I slowly got sick of having to argue with obstinate editors who simply didn't want to change the article they'd originally written. Eventually, I realised the power that could be held by a small but determined group: the clincher for me was two other editors who kept altering the same article to suit their non-NPOV, uncited perspective. When I filed for protection, the admin protected the page with the vandalism in place (he had done this with several other pages) and then decided to quit Wikipedia!

I deleted my account. Now when I see vandalism and erroneous text, I just sigh and move on.

--
"You can justify anything by putting it in quotes, adding a famous name and making it a sig" - Albert Einstein
Re:That doesn't seem like alot by mdwh2 · 2006-11-06 02:30 · Score: 1

Plagiarism has nothing to do with copyright violation. If I falsely claim something as my own writing it is plagiarism.

Agreed - in which case this article seems completely unfounded, in that no one is claiming credit for the Wikipedia articles.
Re:That doesn't seem like alot by kalidasa · 2006-11-06 03:34 · Score: 1

You can attribute by title and first publication date, with the author "unknown" or "anonymous." In the case of a government document, "Publication 4516, US Printing Office, 2006" might be appropriate.
Re:That doesn't seem like alot by TheCarp · 2006-11-06 05:18 · Score: 1

Yes, your comments are interesting and I would like to subscribe to your newsletter!

Seriously though... it's one of those things that's so funny because it's true. However, that's also why it's so sad.

I am not even terribly involved in any wikis... I just know human nature, and yes, people generally would much rather complain and raise flags than fix problems. I know because.. well.... me too motherfuckers!

-Steve

--
"I opened my eyes, and everything went dark again"

How is this news? by JanusFury · 2006-11-05 03:35 · Score: 1

Really, how is this news? I don't get it.

--
using namespace slashdot;
troll::post();

Re:How is this news? by Klaidas · 2006-11-05 03:37 · Score: 1

You must be new here...
Re:How is this news? by LeRaldo · 2006-11-05 03:53 · Score: 1

I bet that's news to him.
Re:How is this news? by pasamio · 2006-11-05 04:59 · Score: 1

especially since his user id is half of the child, but you'd expect your parents to be older than you in some cases

--
I always wondered where this setting was...

Impressive by Solder+Fumes · 2006-11-05 03:36 · Score: 3, Interesting

Wow. Only 142 articles in which average Joe Wiki forgot the proper way to attribute a source. I'm actually amazed there were so few occurrences. This article has the effect of heightening my opinion of Wikipedia's quality.

Re:Impressive by porkThreeWays · 2006-11-05 03:52 · Score: 1

In high school while doing term papers at least 1/3 of most of my papers weren't written by me. They were quotes from other sources. What's the difference? It's only plagiarized if you don't cite the source properly. Legally you are allowed to take small quotes and use them in a publication as long as you cite sources. I'm guessing many of those offenders could go legit just by citing the source alone without removing the quote.

--
If an officer ever threatens to taze you, say you have a pacemaker.
Re:Impressive by Salmar · 2006-11-05 04:28 · Score: 1

Please read more carefully. That 142 was the number of articles found in the stated sample of 12000 articles.

--
This is not the signature you're looking for.
Re:Impressive by multisync · 2006-11-05 04:46 · Score: 1

This article has the effect of heightening my opinion of Wikipedia's quality.

I agree. Plagiarism is a reality that all publications have to deal with, and Wikipedia's response seems to be a reasonable one. They have removed the questionable content pending review.

A while back one of my local papers became aware that a columnist was copying material from another paper. They fired her and printed an apology to their readers and the publication she stole from, and moved on.

This Daniel Brandt apparently has an axe to grind against Wikipedia because he was unhappy with an article that was written about him. Among other things, he feels people who write and edit articles should make their identities public, basically so he can sue them if he doesn't like what they write. Reminds me of something else I read here recently.

--
I don't care why you're posting AC
Re:Impressive by Penguinoflight · 2006-11-05 06:16 · Score: 1

First, there's a ton of information which is "common knowledge"; this means that plagiarism doesn't apply. Second, unless someone makes a direct quote of something they read, it won't show up as plagiarism even if it is. The 142 count just means that all of them were flagrantly plagiarized. This still seems rather low, but it makes a little more sense.

--
"And we have seen and do testify that the Father sent the Son to be the Savior of the World"
1 John 4:14
Re:Impressive by Achromatic1978 · 2006-11-05 11:17 · Score: 1

he feels people who write and edit articles should make their identities public, basically so he can sue them if he doesn't like what they write.

Let's leave aside "doesn't like what they write", that's your subjective description. But I'm curious, what exactly is your argument for anonymity and freedom to defame and libel someone? (On the flipside, nor am I implying that is the case for Daniel Brandt, either.)
Re:Impressive by Solder+Fumes · 2006-11-05 17:16 · Score: 1

Did I say "only 142 articles in all of Wikipedia"? No. Please read more carefully.
Re:Impressive by multisync · 2006-11-06 15:10 · Score: 1

Let's leave aside "doesn't like what they write", that's your subjective description.

No it's not. It's hard to imagine someone accusing Wikipedia of libeling them over an article he did like.

what exactly is your argument for anonymity and freedom to defame and libel someone?

Defaming someone by, say, using an out-of-context photo of them on your "anti-Wikipedia" site to sway public opinion. (I wonder if he has the permission of the rights holder of that photo to publish it on his site.)

Ok, maybe that's a stretch. Do you have an argument against privacy and freedom of speech?

I don't recall arguing in favour of defaming and libeling people; I took issue with his stance that Wikipedia should be held to a higher standard and, therefore, editors and contributors must relinquish their right to anonymity. If someone libels or defames you, it makes no difference whether they do it on Wikipedia, or posting anonymously on Slashdot. If you feel you've been libeled, and suffered some tangible damage, plead your case to a judge and ask him to subpoena the site host for their logs.

Or would you rather have to provide proof of identity every time you post a comment on Slashdot?

--
I don't care why you're posting AC

Not shocking, but not a big deal by Chairboy · 2006-11-05 03:36 · Score: 2, Interesting

What's missing from the summary is that almost immediately upon getting the list, the articles in question were dealt with and the offenders were blocked or warned.

Wikipedia is written by a large community, and people make mistakes. I have read about other reference tomes that have been caught plagiarizing (for example, some encyclopedias or atlases will put in a fake piece of data or a fake street so that they can easily determine if they're being copied from), and the turnaround time for fixing it can be years depending on the publishing cycle.

This isn't a condemnation of Wikipedia, despite Mr. Brandt's best efforts; it's a confirmation of why WP works.

Re:Not shocking, but not a big deal by crossmr · 2006-11-05 13:39 · Score: 1

Of course it's missing. Slashdot cannot reach the Fox News network level of FUD by providing accurate summaries.

Only 142?! by thelamecamel · 2006-11-05 03:36 · Score: 1

142 articles is bugger-all, not all of these cases were actually plagiarism, and the biggest cited example in TFA is "An entire paragraph in Alonzo Clark's entry". Surely there has been much greater, and more significant plagiarism in Wikipedia than this? Why is this number so low?!

Re:Only 142?! by Smallpond · 2006-11-05 06:39 · Score: 1

It's interesting that Wikipedia also removed the edit history so that we can't tell what was there or who contributed it in the first place.
Re:Only 142?! by interiot · 2006-11-05 06:51 · Score: 1
The actual contents of the deleted versions obviously won't be visible, since it's a legal issue. The edit history metadata used to be visible to everyone, until vandals started being "funny", and leaving personal information in edit summaries, so unfortunately the edit history isn't automatically visible now. But admins may cut-n-paste the history on request. Here's that one:
- 07:17, 23 October 2006 Alphachimp (Talk | contribs | block) deleted "Alonzo M. Clark" (g12)
- 21:18, 21 May 2006 . . Siva1979 (Talk | contribs | block) (minor wikification)
- 22:32, 10 March 2006 . . Jack Cox (Talk | contribs | block)
- 19:24, 8 January 2006 . . HollyAm (Talk | contribs | block) (change stub)
- 06:38, 5 December 2005 . . Frank101 (Talk | contribs | block) (sort stub)
- 01:31, 10 November 2005 . . NatusRoma (Talk | contribs | block) (+links; +cats; US-politician-stub)
- 22:02, 9 November 2005 . . DJ Clayworth (Talk | contribs | block)
- 22:01, 9 November 2005 . . DJ Clayworth (Talk | contribs | block) (fmt)
- 21:57, 9 November 2005 . . 138.88.161.238 (Talk | block)

Pfizzle. by Etherwalk · 2006-11-05 03:37 · Score: 1

142 out of 12,000, some of which aren't really a problem, and that's numbers generated by a critic?

Yes, it's a problem, but that's actually not a bad score at all. You probably get more plagiarism than that on college papers at good schools. How many of these articles cite what they "plagiarize," even if they don't put it in quotes? Also, to make it legal plagiarizing, all you have to do is re-write each paragraph in your own words.

I see 1.18% of articles as potentially having text lifted from somewhere else as a serious issue for the maintainers of Wikipedia, sure. But I don't think it has a major negative impact on its reliability, or on the quantity or quality of information contained within it. And reliable information is what I care about when I go to Wikipedia. If it worked only by having mass excerpts of other sites, I'd call it "GOOGLE," and I'd still use it.

Re:Pfizzle. by Daniel+Rutter · 2006-11-05 04:12 · Score: 1

142 out of 12,000, some of which aren't really a problem, and that's numbers generated by a critic?

And a very... dedicated critic, too.

I must admit there's a certain recursive appeal to the idea of someone being notable enough for a Wikipedia entry purely because of his vehement attempts to avoid being mentioned on Wikipedia.

As usual, the talk page has lots of entertaining dirt.

(Uncyclopedia has the real low-down, of course.)
Re:Pfizzle. by Etherwalk · 2006-11-05 04:27 · Score: 1

Legal != ethical.

A school's honor code may be very different from a nation's copyright laws. (As they should be.) Ideally, if you come up with an idea in conversation with a few friends around a coffee table, and they contribute meaningfully to the genesis of the idea, you'll cite, thank them, or credit them in the finished product. But from a copyright status, while you can copyright the form of an idea, you can't usually copyright the idea itself--which is why you can write a new horror novel, or a new formulaic fantasy or soap opera. It's also why you can write any work about history. The copyright of the people who wrote the books you used to research (and even if you had primary sources, you almost certainly got information from copyrighted materials as well,) doesn't apply to your work, even if you're conveying the same information.
Re:Pfizzle. by pasamio · 2006-11-05 05:05 · Score: 1

See, this is the thing that gets me about academics. Unless you have the relevant sized pole shoved in your preference of orifice and can point to it accordingly, you cannot have your own opinion or new idea. It has to be someone else's. It shits me off because I have so many strange ideas that I'm not going to bother looking in case some retard had them before. The concept that you have to be some brilliant person to have an idea just annoys me. I remember back to studies on ancient history and the development of farming. Around the globe around the same time different independent civilizations developed the concept in varying degrees (some also developed irrigation earlier as well due to needs or different methods of irrigation to resolve problems). Do they all need to reference God for showing them how to tend the ground?

--
I always wondered where this setting was...
Re:Pfizzle. by Etherwalk · 2006-11-05 07:15 · Score: 1

Actually, I think his complaint was about students who didn't have an original idea. (Although I suppose it was worded ambiguously.) If you're analyzing a text and come up with an idea on your own, that's fine; but if you come up with the same idea by reading an article about the text, you should cite the article. That's fair.
Re:Pfizzle. by Achromatic1978 · 2006-11-05 11:23 · Score: 1

on college papers at good schools

At "good schools"? Why, are people who go to "good schools" less likely to plagiarise than students who can only get into, say, "mediocre schools"?
Interestingly, in Australia, most first year university dropouts are the product of "good (high) schools", who are far more interested in being able to jack up their fees on the premise that "99% got into university!" to worry about little details like teaching students to think for themselves. Teachers spoonfeed answers to get the best grades, kiddies run off to university and are shocked and dismayed that when the professor says "Your research assignment is..." it's not analogous to "Go to TA / tutor, ask for answers, write up in essay form."
Re:Pfizzle. by Etherwalk · 2006-11-05 11:37 · Score: 1

I don't know--I haven't studied the matter. But my thoughts at the time were about top-tier schools, and I thought I'd generalize a little to "good schools."

Secondary education in many areas in the states is pretty terrible, too. The horror stories I've heard from Hawaii, Philly and New York are... well, terrible. And that's the fault of schools, parents, administrations, teachers, and the children themselves. It's a collective failure; even though some people try to change it.

As to the high school/University shock, it goes a lot of ways, but what you say is certainly true to a degree. There's little as frustrating as when you find professors at college who expect you to regurgitate information or their personal opinions, rather than to analyze or study legitimately or research.

Re:1% plagarism! by Solder+Fumes · 2006-11-05 03:44 · Score: 1

Plagiarizing on Wikipedia has to be one of the more victimless "crimes" I can think of, especially since entries are essentially anonymous and no one else is really getting quantifiable credit for using someone else's text in a wiki article.

The proof of the pudding by GerardM · 2006-11-05 03:45 · Score: 1

The proof of the pudding is in the eating; consider: Mr Brandt comes up with a computer generated list of potentially problematic articles. These are scrutinized and, where needed, problematic content is removed. The wiki methodology works thanks to Mr Brandt.

Conclusion; the best way of improving Wikipedia is by showing where it has a problem. Mr Brandt disproved his opinion. Live and learn. :)

Thanks,
GerardM

Re:The proof of the pudding by JourneyExpertApe · 2006-11-05 11:48 · Score: 1

I've seen lots of plagiarism on WP myself. Mostly they were sourced, but they were copied directly from the source and not put in quotes. My usual response is to delete the offending passage and put a note in the discussion with a link to the plagiarism article. Sometimes people have asked me why I didn't paraphrase the content myself. My stock response is, "to make a point." WP won't survive if its users continue to rip off others' work. I think WP needs to take this more seriously by, for example, providing a link on every page where plagiarism can be reported and imposing a temporary ban on people who do this. WP does NOT need the contribution of people who can't understand plagiarism or refuse to abide by the rules.

--
If you can read this sig, you're too close.

Daniel Brandt, valuable Wikipedia contributor by alienmole · 2006-11-05 03:49 · Score: 4, Insightful

Brandt is doing a great service to Wikipedia: checking for and reporting plagiarism. That takes dedication and hard work. It's ironic that he feels the need to present it as criticism of Wikipedia's model, when in fact he's demonstrating the power of contributions from many people with different motivations. Even if the motivation is anti-Wikipedia, Wikipedia just absorbs the input and grows stronger.

"If you strike me down, I shall become more powerful than you could possibly imagine..." -- Obi Wiki-nobi

Re:Daniel Brandt, valuable Wikipedia contributor by sjwest · 2006-11-05 04:49 · Score: 1

If memory serves me, http://en.wikipedia.org/wiki/Charles_Van_Doren didn't cheat either (Game show 21) and he then worked on the Britannica.

Funny thing 'cheating'.
Re:Daniel Brandt, valuable Wikipedia contributor by Pharmboy · 2006-11-05 05:35 · Score: 1

I posted an article on Wikipedia that was a copy of a webpage I had written (and own) on another site. It was taken down on Wikipedia within 24 hours until I posted on the FIRST site that the information was under the GFDL, which took me all of 5 minutes, then the article was restored. No harm, no foul, the editor was just taking no chances. I would say they are pretty good at catching potential copyright issues, at least from MY experience.

Besides, 142 out of 1,500,000 articles is only 0.009% of the content, or Wikipedia is 99.991% Infringement Free. Ivory brand soap isn't that pure. Even Everclear is only 98% alcohol.

--
Tequila: It's not just for breakfast anymore!
Re:Daniel Brandt, valuable Wikipedia contributor by alienmole · 2006-11-05 05:51 · Score: 1

Brandt apparently only checked 12,000 articles, so that makes it about 1.2%. It's still pretty good for such an open resource, and it's not clear to me what Brandt's criticism really is — it seems more like a smear attempt. It's not as though Wikipedia is embracing plagiarism and refusing to do anything about it.

From an ex-wikipedia administrator by BMIComp · 2006-11-05 03:51 · Score: 1

I used to be a wikipedia administrator, before resigning due to time constraints. However, we would catch a lot of the copyright issues. I mean, when you're reading an article, and part of it is plagiarized, it's usually really obvious. The plagiarized part usually doesn't fit into the rest of the article, and you can just tell that the average editor didn't write that copy. (Just as I'm sure a teacher can tell one of his/her students didn't write a plagiarized essay.) Once you found the possibly infringing content, you could google parts of the suspect text, and see if it appears anywhere else. If it does, you'd either report the problem or remove it yourself.

I used to run into these all the time... but the thing is... a lot of them are caught and removed. Wikipedia has a system to deal with such infringements, and the users that post them. (See Wikipedia's policy and their copyright problems reporting page) The truth is that you're going to find copyright problems wherever there is user-submitted content (look at YouTube, for example).

Re:1% plagarism! by Solder+Fumes · 2006-11-05 03:55 · Score: 1

Imitation is the sincerest form of flattery...are you coming on to me?

142 out of 12,000? by MMC+Monster · 2006-11-05 03:55 · Score: 1

142 articles out of 12,000 is certainly a problem, but actually not much of one. I'm sure if he made his script public (I have no idea if he did so. In the /. tradition, I did not RTFArticle) and the wikipedia were to use it, it would be of benefit. Not to automatically tag articles as plagiarism, but at least tag them for further evaluation by an editor.

But, hey, 142/12000 is less than 2%. I would have thought the percentage would have been at least 5%.

--
Help! I'm a slashdot refugee.

Re:142 out of 12,000? by AtomicBomb · 2006-11-05 05:58 · Score: 1

I am quite sure the ratio would be the same or even higher if the wiki critic managed to compare published books with existing copyright material. Many so-called experts are no exception to this, especially when they are writing "supportive chapters" for their books (e.g. the video hardware technology review by a software researcher/professor writing a book about OpenGL vs DirectX)....
Re:142 out of 12,000? by John+Hasler · 2006-11-05 06:17 · Score: 1

I wonder what the "plagiarism rate" is in Britannica?

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.

Are you going to the prom? by goombah99 · 2006-11-05 04:01 · Score: 1

I ask because apparently you did not actually graduate high school yet if you can't understand what the difference is between cited and uncited text.

--
Some drink at the fountain of knowledge. Others just gargle.

I hope you're not contributing... by NineNine · 2006-11-05 04:03 · Score: 1

...especially to any math articles. 142 is 1.183...% of 12000. Not "less than 0.1%"
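
Both percentages quoted in this thread can be checked directly. A minimal sketch in Python, using only the counts mentioned in the comments (142 flagged, 12,000 checked, roughly 1.5 million English articles); the variable and function names are illustrative, not taken from Brandt's report:

```python
# Counts as quoted in the thread; none of these come from the report itself.
flagged = 142
sampled = 12_000          # articles Brandt reportedly checked
all_english = 1_500_000   # rough English article count cited in the thread


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


rate_in_sample = percent(flagged, sampled)
rate_in_all = percent(flagged, all_english)

print(f"{rate_in_sample:.3f}% of the articles checked")
print(f"{rate_in_all:.4f}% of all English articles")
```

The two figures answer different questions: the first is the hit rate among the articles actually checked, the second only makes sense if every article had been checked.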

Re:1% plagarism! by cddale · 2006-11-05 04:08 · Score: 1

Not true. It is estimated that at least 13% of articles in first-tier journals (NEJM, JAMA, etc.) listed on PubMed are "ghostauthored" papers--written by drug companies for promotional purposes and where the named authors had little or nothing to do with the study, but were instead paid to front as the authors in order to remove the appearance of bias that would result from drug company authorship, add credibility on the basis of the phony author's reputation, and to promote off-label drug use (that is, for indications beyond what the FDA has approved) which is otherwise illegal. Some of these papers are actually published multiple times. There is a famous example where essentially the same paper was published three times by three different and non-overlapping groups of authors. In one version the sole author's name was even misspelled. The matter was brought to the University of Washington, where one of the authors was on the faculty (and, in fact, the former dean of the medical school), and the University of Washington held that it did not meet the definition of plagiarism (arguing that consent from the original source, which was granted here by the ghostauthor, was a requirement for plagiarism) and did not force a retraction, a printed correction, or even discipline the so-called author of this paper.

Re:1% plagarism! by Zeinfeld · 2006-11-05 04:09 · Score: 1

Any Journal article comprised of 1% plagiarism would be subject to law suits, apologies and the journal would face ostracism.

There is a big difference between plagiarised articles and articles with plagiarised passages. Pretty much every medium has a significant plagiarism rate, including scholarly journals.

The methodology in this case is more than a little suspect. At least 50% of Wikipedia is utter crap. There is fancruft, stubs, POV peddling forks. Anyone who is involved with Wikipedia will admit as much. The fact is that it does not matter if the article on the garage band 'Frog the Bustards' is plagiarised or not, only twelve people will read it before it gets deleted, albeit that's five more than have heard the band. The similarity to the official biography is because both were written by the lead singer's girlfriend.

The Britannica comparisons are plain silly. There are 1.5 million articles in Wikipedia of which something like 200,000 could be considered competition to Britannica. OK the Harry Potter pages are interesting and useful but that's not what Britannica claims to provide. That still leaves Britannica in the dust with a mere 100,000 articles.

Fact is that Britannica is not much use on most of the things I need an online source for and equally useful for the things I would use Britannica for. No encyclopedia is 100% trustworthy, the information is inevitably out of date in Britannica. There is no entry at all in Britannica for what I use it most often - tracking the latest computing neologisms.

The most valuable aspect of Wikipedia is precisely the fact that its pages come with 'caveat lector' written on every page. If you read Wikipedia without being aware of possible POV peddling you are an idiot, if you read Britannica without being aware of possible POV peddling you are also an idiot, if you watch Fox News without being aware that it is POV peddling 24 hours a day you are an utter fool.

--
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/

US Gov copyright? by julesh · 2006-11-05 04:10 · Score: 1, Insightful

Articles with offending passages have been stripped of most text. An entire paragraph in Alonzo Clark's entry, for instance, was deleted, leaving the article with the bare-bones: "Alonzo M. Clark (August 13, 1868-October 12, 1952) was an American politician who was Governor of Wyoming from 1931 to 1933."

The original article, Brandt said, was copied from a biography on the Wyoming state government site.

Err... I thought works of the US Government were generally free from copyright...?

Re:US Gov copyright? by Microlith · 2006-11-05 04:31 · Score: 1

Citations are still required, even for the work of Government officials.
Re:US Gov copyright? by osu-neko · 2006-11-05 04:43 · Score: 1

Citations are still required, even for the work of Government officials.

Especially so. It's always important to know the source of your information to evaluate potential bias, and particularly when the source has a long track-record of fudging the truth for self-serving purposes.

--
"Convictions are more dangerous enemies of truth than lies."
Re:US Gov copyright? by athmanb · 2006-11-05 04:48 · Score: 1

Only those of the federal government. Those of most states aren't.
Re:US Gov copyright? by AxelBoldt · 2006-11-05 04:50 · Score: 1

Citations are still required, even for the work of Government officials.
By (often ignored) Wikipedia policy, which requires sourcing of all statements, but not by law.
Re:US Gov copyright? by DragonWriter · 2006-11-05 04:53 · Score: 3, Insightful

Err... I thought works of the US Government were generally free from copyright...?

(1) The Wyoming state government is not the US government: state government works are not generally free from copyright.

(2) Plagiarism is separate from copyright violation, anyway. Using material that is not subject to copyright or is in the public domain that is from one unique identifiable source without crediting the source is plagiarism, as is using copyright material in a way that does not violate copyright without attribution (say, fair use.) Plagiarism isn't a violation of the law, but a violation of commonly accepted standards of integrity when it comes to not claiming others' work as your own.
Re:US Gov copyright? by asuffield · 2006-11-05 06:04 · Score: 1

Only in theory. They figured out a way to work around that pesky law a long time ago - a private contractor is given the task of 'producing' the work, with assistance supplied by the government. "Assistance" here means that the government supplies all the people who do the actual work on it. The contractor then sells the copyright to the government. This little legal fiction results in a work that was produced entirely by government employees and using government funds, that is copyrighted and owned by the government. It's now standard practice in any case where the government thinks they might want to have copyright on something.
Re:US Gov copyright? by imsabbel · 2006-11-05 08:06 · Score: 1

But seeing that the policy of wikipedia FORBIDS original research or works to be presented, I don't think that plagiarism is really that much of a violation here.
Everybody with half a brain can tell that the knowledge didn't manifest itself out of thin air, even without citations given.

--
HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
Re:US Gov copyright? by Lost+Race · 2006-11-05 09:19 · Score: 1

Why is plagiarism per se (of public domain material) a problem in Wikipedia? Plagiarism is claiming someone else's work as your own -- who is making the claim in a Wikipedia article, when all its content is effectively anonymous?
Re:US Gov copyright? by kalidasa · 2006-11-05 09:56 · Score: 1

It forbids original research from being presented, but also requires citations. Also, there's an important distinction between synthesizing research and simply copying text whole from another source. In both cases, there should really be a citation; but in an encyclopedia, the former without citation is not preferred, but not embarrassing; the latter is simple plagiarism in any genre. By "copying text," by the way, I don't just mean cut & paste, but even rewording. Synthesis is required, but regurgitation is frowned upon.
Re:US Gov copyright? by DragonWriter · 2006-11-06 03:27 · Score: 1

But seeing that the policy of wikipedia FORBIDS original research or works to be presented, I don't think that plagiarism is really that much of a violation here.
Sure it is. If it's plagiarized, it isn't verifiable because sources aren't cited properly and it is presented as original research (despite that being forbidden), so WP:VERIFY, WP:CITE, and WP:OR all come into play, whether or not it is also a copyright violation.

Biographical articles. by Anonymous Coward · 2006-11-05 04:16 · Score: 4, Funny

It's very lazy of the Wikipedia authors to enter the same biographical information as other sites.
They should write new and interesting histories for all these people rather than using the same old worn out ideas that are on so many places on the net.
All it takes is a little imagination.
A new birth place, better achievements (why could Hitler not have discovered the cure for cancer and be the first man on the moon? It's better than the depressing story on Wiki at the moment.) and some creative editing would solve this problem once and for all.

Some Wiki articles are already better and contain things about people that have never happened, but sadly these often get put back to the same old boring stories almost as soon as the changes are made.

Re:Biographical articles. by _Sprocket_ · 2006-11-05 07:27 · Score: 1

why could Hitler not have discovered the cure for cancer and be the first man on the moon? It's better than the depressing story on Wiki at the moment.

I also understand he was responsible for tripling the population of African elephants during his lifetime.
Re:Biographical articles. by MMC+Monster · 2006-11-05 07:34 · Score: 1

I always found this biography on Hitler less dull than the one from wikipedia:
http://uncyclopedia.org/wiki/Adolf_Hitler

--
Help! I'm a slashdot refugee.

ok methodology, bad analysis by fermion · 2006-11-05 04:16 · Score: 1

In this kind of study, basing the conclusion on the presence of few hits would characterize the study as faith based science.

First, the sample size was 12,000. Where did that number come from? Were the samples picked randomly? Assuming so, is 12,000 a statistically effective sample size? And if the samples are random, and the size is sufficient, is that 142 articles statistically significant, that is, are the number of matches outside the margin of error? In other words, do the sample size, selection, and methodology merit a margin of error around 1%?

And then we get to the fact that sometimes wikipedia text is copied to other sites. This in itself leads to the conclusion that wikipedia has some credibility, even if unfounded. I found it interesting that we are not told how many articles off wikipedia were plagiarized. I also wonder what 'Wikipedia appeared to be the one plagiarized' means, and what systematic errors were introduced by that subjective judgement. Perhaps 1%?

There is no question that plagiarism is a big issue, and we all must watch for it. I am on the side that plagiarism is no more an issue than in the past, but with better communication and distribution, we catch it more. At some level, because it is so easy to plagiarize now, we perhaps see more egregious cases of it.

What gets me is that an analysis of such low analytical value is news. I am once again amazed at how little people seem to know or care about proper logic. In the end all we know is that some study with questionable methodology produced 142 hits. Not a huge revelation, even if we stipulate the study is of even minimal value.

--
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black

Re:ok methodology, bad analysis by Skippy_kangaroo · 2006-11-05 08:41 · Score: 2, Informative

12,000 is easily enough to be statistically effective. Election polling gets acceptable results with samples of about 1,000.

Assuming that it is a binomial distribution then p=142/12000=0.0118, q=0.9882, n=12000 which means the standard error is sqrt(npq)=11.8 (approximately). Thus a 95% confidence interval is that the true number of plagiarised articles in the sample lies between 119 and 165.
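
The binomial arithmetic in this comment can be reproduced in a few lines. The sketch below is only an illustration, assuming Python and the usual normal approximation with z = 1.96; the variable names are illustrative and nothing here comes from Brandt's report. It gives a standard deviation for the count close to 11.8, which is consistent with an interval of roughly 119 to 165.

```python
import math

n = 12_000      # articles checked, as quoted in the thread
hits = 142      # articles flagged, as quoted in the thread

p = hits / n    # observed proportion
q = 1.0 - p

# Standard deviation of a binomial count: sqrt(n * p * q)
std_count = math.sqrt(n * p * q)

# Normal approximation: 95% interval is count +/- 1.96 standard deviations
z = 1.96
low = hits - z * std_count
high = hits + z * std_count

print(f"p = {p:.4f}, sd of count = {std_count:.2f}")
print(f"95% interval for the count: {low:.0f} to {high:.0f}")
print(f"as a rate: {100 * low / n:.2f}% to {100 * high / n:.2f}%")
```

Note that this interval only describes sampling noise. It says nothing about whether the 12,000 articles were a random sample, which is a separate question.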

And this is only plagiarism from on-line sites that are indexed by Google. Plagiarism from dead tree sources could well be significantly more.

This has got nothing to do with faith-based science and low analytical quality. I am once again amazed at how little people seem to know or care about proper statistics and just say "I don't believe it" if something doesn't accord with their preconceived notions.
Re:ok methodology, bad analysis by afaik_ianal · 2006-11-05 18:28 · Score: 1

No, this was not a random sample. He has taken only biographies of people born before 1890. Furthermore, his sample of 12,000 is taken from a super-sample of nearly 17,000 articles. He weeded out 5000 articles because the text did not suit his extraction algorithm. Both of these are systematic biases.

Comment removed by account_deleted · 2006-11-05 04:19 · Score: 2, Insightful

Comment removed based on user account deletion

That is a very unreal scenario by cucucu · 2006-11-05 04:29 · Score: 1

Daniel Brandt wrote a script that pinpoints all the plagiarized pages. That is a very unlikely scenario in real life. He should have selected a sample of random pages, and checked which were plagiarized.

Claim surfing the web is risky because his firewall only gives access to phishing sites
Say sex is dangerous, because he frequents a nightclub where all members have STDs
Assert numbers don't have square roots because his population is made of negative numbers.

Re:That is a very unreal scenario by Yetihehe · 2006-11-05 04:44 · Score: 1

Erm, he didn't select random pages, because he probably checked ALL the pages. And then selected those plagiarized. So if he selected only random subset, he would have smaller sample.

Either this, or I'm totally uninformed because I didn't RTFA (I'm just lazy).

--
Extreme Programming - Redundant Array of Inexpensive Developers
Re:That is a very unreal scenario by John+Hasler · 2006-11-05 06:40 · Score: 1

He did not check all the pages. He only checked 12,000. How did he choose them?

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.

Confused? by superstick58 · 2006-11-05 04:31 · Score: 1

I'm confused by the concept of plagiarism on wikipedia. For example, the article describes a biography copied from a government website. Isn't the point of Wikipedia to catalog and assemble information? How is copying an openly published biography from a government website considered plagiarism? Wikipedia is not being sold. No one is taking credit for the articles. Most cases, the original info is cited anyway. Anyway, please let me know what I'm missing here (which is probably a lot).

Re:Confused? by AxelBoldt · 2006-11-05 05:08 · Score: 1

Plagiarism is not a legal term, it's a term used in journalism and academia to describe taking somebody else's words or ideas and presenting them as your own, without attribution. In these realms, it is considered unethical.
If you copy somebody's words, and these words are still under copyright (that is, not in the public domain, as they would be if, for instance, the author were long dead or worked for the U.S. government), and you can't defend the use as "fair use", then it's a civil offense and they can sue you (in some countries and severe cases it's even a criminal offense). Whether you attribute the copied material or not is irrelevant for the legal status of copyright infringement.
Wikipedia is extremely vigilant in removing copyright infringements. Plagiarisms of public domain works are not considered that big a deal; Wikipedia policy generally requires sources for all statements but that is rarely enforced.

Even Virus authors contribute by tmk · 2006-11-05 04:31 · Score: 1

Authors of malware are trying to exploit the good reputation of Wikipedia to infect PCs with their malicious software. In a mass e-mail, recipients were told to download a "security update" for Windows from a Wikipedia site.

The attackers had used a Wikipedia feature that archives all previous versions of articles when changes have been made. The malicious page thus continued to exist in the archive, and the attackers were able to point to it in mass emails.

See here , here and here.

Re:Even Virus authors contribute by Faylone · 2006-11-05 07:55 · Score: 1

and here?
Re:Even Virus authors contribute by tmk · 2006-11-05 08:14 · Score: 1

I submitted this two days ago with better sources and more details...
Re:Even Virus authors contribute by hawaiian717 · 2006-11-05 08:41 · Score: 1

There is a way to deal with this. If an article is deleted, the history gets erased. An administrator could copy the current, clean content of the article, delete the article, then recreate it from the clean version.

--
End of Line.
Re:Even Virus authors contribute by tmk · 2006-11-05 09:01 · Score: 1

That is neither allowed nor effective. Administrators can purge single revisions of an article and keep the rest. By erasing the whole article they would erase all information about the article authors, which is not allowed by the GFDL.
Re:Even Virus authors contribute by hawaiian717 · 2006-11-05 15:47 · Score: 1

Neat, I wasn't aware of the capability to purge individual revisions of an article. Agreed this is better than deleting and recreating the entire article, exactly because of the attribution issue.

--
End of Line.
Re:Even Virus authors contribute by Chuq · 2006-11-05 23:40 · Score: 1

Well, the way it is done it is kind of a bit of both. You can't delete individual revisions - you can only delete the whole article including all history. However once that is done, you can undelete selected revisions - for example, just the latest revision, or all revisions apart from one.

--
- Chuq

Okay, Brandt is learning. by WWWWolf · 2006-11-05 04:31 · Score: 1

This is how you fix the problems with Wikipedia: Point them out in a way that makes the problems easy to fix. Okay, it's probably still harder to get criticism against user conduct and policies reacted upon, but the way Wikipedia works, the content is still easy to fix. Especially in the case of plagiarism.

I really wish people would conduct accuracy and plagiarism studies a bit more often - especially when it's easy to fix, like this.

And by the way, Wikipedia recently got a bot that finds suspected plagiarism, which is pretty cool.

Turns out they weren't plagiarized... by cliveholloway · 2006-11-05 04:36 · Score: 1

They were just authored by Roland Piquepaille. His articles are always all his own work, so it must be a mistake in the program.

--
-- Trinity in high heels carrying a whip: The donimatrix - there is no spoonerism

Statically unsignificant by shareme · 2006-11-05 04:37 · Score: 1

Someone needs to brush up on stats.. Get back to us when the results have been statistically analyzed to determine whether they are actually anything to worry about..

--
Fred Grott(aka shareme) http://mobilebytes.wordpress.com

Comment removed by account_deleted · 2006-11-05 04:37 · Score: 1

Comment removed based on user account deletion

How works the Wherebot? by tmk · 2006-11-05 04:39 · Score: 1

Is there a description how this bot identifies plagiarism? Does he search for random edits?

Re:How works the Wherebot? by WWWWolf · 2006-11-05 05:07 · Score: 1

Is there a description how this bot identifies plagiarism? Does he search for random edits?

I don't know if there's a really detailed description anywhere, and I'm not coffeed enough to find anything more, but the bot's user page says it searches for phrases found in new articles through Yahoo search API. So it may not be good for finding plagiarism that's been inserted to articles that are more than a few days old, I suppose, but it does help to find cases where people just copy-paste web stuff to new articles. Basically, this helps the new-page patrollers a lot.
Perhaps something to vet every edit would be cool =)
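
The phrase-matching idea described above can be sketched without any search engine. The following is a hypothetical illustration in Python, not the bot's actual code: it assumes you already have the text of a suspected source, splits both texts into overlapping six-word phrases, and lists the phrases they share verbatim. The function names and the `candidate` text are made up for the example; the web search step the bot performs is not shown.

```python
import re


def shingles(text, size=6):
    """Return the set of lowercase word n-grams (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def shared_phrases(article, candidate, size=6):
    """Return the shingles that appear verbatim in both texts, sorted."""
    return sorted(shingles(article, size) & shingles(candidate, size))


# Article sentence as quoted elsewhere in this thread.
article = ("Alonzo M. Clark was an American politician who was "
           "Governor of Wyoming from 1931 to 1933.")
# Invented comparison text, for illustration only.
candidate = ("Clark, an American politician who was Governor of Wyoming "
             "from 1931 to 1933, was born in 1868.")

for phrase in shared_phrases(article, candidate):
    print(phrase)
```

A long run of shared phrases is only a flag for a human to look at: it cannot tell which text copied which, or whether the source was public domain.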

Wikipedia bashing du jour by mabu · 2006-11-05 04:46 · Score: 1

It seems to be th3 c00l3ss to bash Wiki lately, but the bottom line is there is no encyclopedic reference that comes close. The media and other pseudo-pundits seem to see any influential source of information that doesn't have obvious corporate influence (read: money-based control) as a major threat, and they do whatever they can to discredit Wikipedia. Aside from a tiny subset of controversial articles that routinely get vandalized, and another tiny subset of plagiarism, this issue is likely to be blown way out of proportion by those who have a vested interest in destroying any information resource they cannot control.

Citations? by Ash-Fox · 2006-11-05 04:49 · Score: 1

I tend to check the citations on Wikipedia. If there is no citation and I can't find a somewhat reliable source on Google related to the information I'm looking at -- I know I can't trust that information.

These people who ramble on that Wikipedia is inaccurate almost appear to me like they never sat a history class in high school, where you have to verify your sources.

I've also never heard of citing encyclopedias in research projects, ever. I've never seen good-grade coursework cite encyclopedia entries either (though it may cite sources that some encyclopedias also cite).

--
Change is certain; progress is not obligatory.

Re:Link to Brandt's Site on Topic by Fnkmaster · 2006-11-05 04:50 · Score: 1

Sorry, but Brandt is a fucking nutjob. Just look around on his sites. That is not a stable, coherent person.

Re:Link to Brandt's Site on Topic by mabu · 2006-11-05 04:51 · Score: 1

The guy's got a 501(c)3 corporation dedicated to bashing Wiki. My guess is it's funded by media and other encyclopedia makers. Follow the money and what you probably will find out about these people is much more disgusting than any transgression on the part of Wikipedia.

Re:Plagiarism or Copyright? by DragonWriter · 2006-11-05 04:56 · Score: 1

Is plagiarism an issue for Wikipedia?

Yes.

But legally, the real issue here is Copyright, isn't it?

Not all issues are legal issues.

There is no copyright in facts. Therefore, nonfiction works are open to have the facts used in Wikipedia. Where a verbatim transcription would not be fair use, someone needs to paraphrase.

The issue here is verbatim use, anyway. An automated script is going to have more trouble finding use of "facts" from another source that aren't verbatim copies of the presentation.

Re:Link to Brandt's Site on Topic by remembertomorrow · 2006-11-05 05:03 · Score: 1

Wow, you're right.

This guy is almost on the same level as Jack Thompson in terms of stupidity/ignorance.

--
Registered Linux user #421033

Here is the link to my report by Everyman · 2006-11-05 05:04 · Score: 1

Why is this news? Maybe because the Associated Press says it's news, and it's in hundreds of newspapers?

Why should Slashdotters care? Because while AP doesn't use links, Slashdot should have the courtesy of linking to the original sources that AP used to generate the report. (Plus AP also checked with Jimmy Wales for a reply, which is expected from professional reporters.)

The report is at http://www.wikipedia-watch.org/psamples.html

Wikipedia's own newsletter reports on it here:
http://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2006-10-30/Plagiarism_cleanup

The efforts of Wikipedia administrators to clean up the mess are chronicled here: http://en.wikipedia.org/wiki/User:W.marsh/list

Of course, Slashdotters may continue shooting from the hip if they choose. It's what they do best.

Brandt vs. Wikipedia by mako1138 · 2006-11-05 05:07 · Score: 1

Brandt has a long-standing (well, year-old) beef with Wikipedia. You can read about it, ironically enough, in the Wikipedia article about him.

He got into a dispute because he didn't like having his biography on WP (though it was constructed from publicly available news sources). He was generally combative and belligerent, and so was blocked and banned various times; check out the Talk archives for details. Afterwards he started a webpage where he attempted to list the real-world identities of the editors involved in the dispute.

Brandt is also the guy responsible for outing the anonymous editor in the Seigenthaler controversy.

Re:Brandt vs. Wikipedia by bigbigbison · 2006-11-05 07:23 · Score: 1

I'd never heard of Brandt before this, but he sounds like an ass. His namebase.org website is ranked low in google, so he starts google-watch.org. He doesn't like his wikipedia bio, so he starts wikipedia-watch.org. Any bets on how long it takes him to start slashdot-watch.com????

--
http://www.popularculturegaming.com -- my blog about the culture of videogame players

Not an unflattering biography by iabervon · 2006-11-05 05:16 · Score: 1

Daniel Brandt is against Wikipedia's portrayal of him not because of it being unflattering (it is, in my opinion, if anything oddly sympathetic to his position, despite his position being that it shouldn't exist at all), but because of his privacy concerns. He's a privacy activist with a particular focus on the actions of information organizing sites, and so he's not unexpectedly against the existence of unauthorized widely-available detailed biographies. He's gone so far as to complain about CIA and NSA websites using cookies, so it's not surprising that he wouldn't be happy about a vast conspiracy to produce reports on unwilling individuals, regardless of the merits of the reports.

Wikipedia is now digg.com, without the credit! by dotancohen · 2006-11-05 05:21 · Score: 1

This isn't surprising, seeing how _anybody_ can edit Wikipedia. The inability to verify has always been an issue with Wikipedia. Furthermore, I'm sure that most of these 'incidents' could be rectified by simply changing a few words and then referencing the source webpage. Then, instead of it being plagiarism, it would be accountable reference work.

Bah.

http://what-is-what.com/what_is/digg.html

--
It is dangerous to be right when the government is wrong.

I don't know by khallow · 2006-11-05 05:24 · Score: 1

Is it such a good idea, checking for and reporting plagiarism? While that takes dedication and hard work, it's notable that he feels the need to present it as criticism of Wikipedia's model, because in fact he's demonstrating the power of plagiarism from many people with different motivations. Even if the motivation is anti-Wikipedia, Wikipedia just absorbs the input and grows stronger. That doesn't seem a good thing.

Re:I don't know by tgv · 2006-11-05 07:19 · Score: 1

Generally speaking, checking for plagiarism and reporting it, is a sound idea. It does not take a lot of dedication and hard work. It's understandable Brandt presents the cases as criticism to Wikipedia, but it does not cripple it. With every slash, Wikipedia grows stronger. It won't be long before many people with different motivations will be told they are plagiarizing Wikipedia.

Brandt's paper and Wikipedia's response by AxelBoldt · 2006-11-05 05:24 · Score: 1

Brandt's original paper is here, explaining his methodology and giving the complete list of articles he found. Wikipedia's response is here, where people go through the list one by one and also check the other contributions of users who have added copyrighted content. Wikipedia also has a bot which aims to detect newly added copyright violations by searching Google.

Re:Brandt is a Republican by MostAwesomeDude · 2006-11-05 05:25 · Score: 1

I'll bite, mostly because people might actually believe what you're saying.

Daniel Brandt doesn't like Wikipedia. His article there was started 'against his wishes,' and although he managed to get it deleted once by a few choice threats, it was quite rapidly created again. Ironically, the community now agrees that his anti-Wikipedia rantings have made him notable enough to be included in the encyclopedia.

Mr. Brandt is certainly not a nice person. While your words "politician" and "Republican" are completely unfounded, it is true that Mr. Brandt maintains a web page chock-full of personal data, including the names and addresses of any Wikipedians who he feels have been mean to him in the past.

The interesting part of all this is that Brandt does not have the authority to order Wikipedia to remove content. That kind of copyright enforcement can only be carried out by the copyright holder. However, he is well aware that Wikipedia's "no copyright violations" policy requires users to immediately quash plagiarized content.

--
~ C.

Not True by viewtouch · 2006-11-05 05:33 · Score: 1

Daniel Brandt can't edit Wikipedia so it's not true that anyone can edit it.

142 isn't bad. by Maxo-Texas · 2006-11-05 05:35 · Score: 1

It's great this guy created a program to make it easier for them to avoid this problem.

That's the great thing about open source and projects like wiki.

You encounter a problem, it's very easy for people to fix it quickly.

If those 142 items are real, they are probably already being fixed now if not all fixed.

--
She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.

Re:1% plagarism! by makomk · 2006-11-05 05:36 · Score: 1

That's a very interesting allegation. Got a source for it?

Is this sample as biased as Wakeman from 151? by tepples · 2006-11-05 05:43 · Score: 1

Does the article make any claims as to how Mr. Brandt chose the sample of 12,000 articles? How can we look for biases in the sample?

Re:Is this sample as biased as Wakeman from 151? by Salmar · 2006-11-05 06:35 · Score: 1

That's not as important as the fact that even though he took a very large sample of articles, and even if the sample was biased, Wikipedia itself has more than 100 times that many articles, and so we should still expect more where those 142 came from.

On the other hand, if the sample was a good representative of the site's content, it's still less than 1% of actually plagiarized material, which is very reasonable if one considers how volatile Wikipedia is.
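
To put rough numbers on the two readings above, here is a back-of-the-envelope sketch. It assumes the figures quoted in this thread (142 flagged out of 12,000 examined, and about 1.5 million English articles) and, more importantly, that the sample is representative, which nobody here has shown. The names are purely illustrative.

```python
# Figures quoted in this thread; none of them are verified here.
FLAGGED = 142               # articles Brandt reportedly flagged
SAMPLED = 12_000            # articles he reportedly examined
TOTAL_ARTICLES = 1_500_000  # English articles, per another comment


def extrapolate(flagged: int, sampled: int, total: int) -> int:
    """Scale the flagged count in the sample up to the whole site.

    Uses integer arithmetic and assumes a representative sample,
    which is an assumption rather than an established fact.
    """
    return flagged * total // sampled


estimate = extrapolate(FLAGGED, SAMPLED, TOTAL_ARTICLES)
print(estimate)  # 17750
```

If the sample was biased toward likely offenders, the real figure would be lower; the estimate is only as good as the sampling.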

This is not black and white. The point is that Wikipedia is, for the most part, a reliable source of information, and where it isn't is usually obvious.

--
This is not the signature you're looking for.
Re:Is this sample as biased as Wakeman from 151? by Paradise+Pete · 2006-11-05 12:30 · Score: 1

it's still less than 1% of actually plagiarized material
It can be seen at a glance that 142 out of 12000 is obviously more than 1%, since 142*100 == 14,200.
Re:Is this sample as biased as Wakeman from 151? by Salmar · 2006-11-06 02:35 · Score: 1

Follow the link, my friend.

--
This is not the signature you're looking for.
Re:Is this sample as biased as Wakeman from 151? by Paradise+Pete · 2006-11-11 04:50 · Score: 1

Follow the link, my friend.
But the link is in error. Plagiarism is plagiarism, no matter what the source, even if it's public domain.
Re:Is this sample as biased as Wakeman from 151? by Salmar · 2006-11-11 11:34 · Score: 1

Plagiarism is plagiarism, no matter what the source, even if it's public domain.
I beg to differ!

From Public domain:
"Public domain comprises the body of knowledge and innovation ... in relation to which no person or other legal entity can establish or maintain proprietary interests within a particular legal jurisdiction. This body of information and creativity is considered to be part of a common cultural and intellectual heritage, which, in general, anyone may use or exploit, whether for commercial or non-commercial purposes."

--
This is not the signature you're looking for.
Re:Is this sample as biased as Wakeman from 151? by Paradise+Pete · 2006-11-11 13:53 · Score: 1

Of course you can use it. But if you claim it as your own, that's plagiarism. Copyright is an entirely separate issue.
plagiarism - the practice of taking someone else's work or ideas and passing them off as one's own.
Re:Is this sample as biased as Wakeman from 151? by Salmar · 2006-11-11 16:49 · Score: 1

Of course you can use it. But if you claim it as your own, that's plagiarism.
Exactly; it's a good thing that Wikipedia's content is also in the public domain, not claimed by any author.

--
This is not the signature you're looking for.
Re:Is this sample as biased as Wakeman from 151? by Paradise+Pete · 2006-11-12 10:34 · Score: 1

Give it up. You're just being stubborn. If Wikipedia is using verbatim text without attribution it's plagiarism, plain and simple. That is *exactly* what plagiarism is.
Re:Is this sample as biased as Wakeman from 151? by Salmar · 2006-11-12 11:00 · Score: 1

I'm sorry if I'm making you upset, but I am trying to be as accurate as possible. My understanding of the meaning of plagiarism, with which I think many dictionaries and encyclopedias agree, is the practice of copying material AND claiming it as one's own. It is my understanding that neither Wikipedia, nor any of its contributors, lays any claim of authorship over its content. Thus, although it would be most courteous to attribute any content copied from the public domain, they are not obligated to do so.

--
This is not the signature you're looking for.
Re:Is this sample as biased as Wakeman from 151? by Paradise+Pete · 2006-11-13 02:13 · Score: 1

Thus, although it would be most courteous to attribute any content copied from the public domain, they are not obligated to do so.
Obligated in what sense? There's no legal obligation - that's not what plagiarism is about. I can go around claiming I wrote "Four score and seven years ago" 'til I'm blue in the face and nobody's going to throw me in jail or sue me. But it's unethical - it's plagiarism. This is an ethical issue, not a legal one.

Re:Linux FAQ by chaoticgeek · 2006-11-05 05:46 · Score: 1

HUH? I did not see anything that was true in that little statement.

--
hello

Brandt's odd sayings by clap_hands · 2006-11-05 05:50 · Score: 1

"They present it as an encyclopedia," Brandt said Friday.

Well, yes. Not that odd, really, given that it is an encyclopedia.

"They go around claiming it's almost as good as Britannica."

Actually, Wikipedians don't, in my experience. Most are quite sober when it comes to comparisons with Britannica. Brandt may be referring to the journal Nature, which did make such a claim for science articles.

They are trying to be mainstream respectable.
Wikipedia is already pretty darn mainstream, and if by "respectable" Brandt means "free of plagiarised material", then he's correct.

Well by 1310nm · 2006-11-05 05:50 · Score: 1

I hope they didn't count the sites that mirror Wikipedia's content as their own (answers.com I think it is that is notorious).

If you equate good with referenced by Kjella · 2006-11-05 06:03 · Score: 1

"They present it as an encyclopedia," Brandt said Friday. "They go around claiming it's almost as good as Britannica. They are trying to be mainstream respectable."

Whether something is plagiarized or not doesn't really impact the quality of it. If someone copied a great article into Wikipedia, then Wikipedia has a great article - just through foul play. There have previously been comparisons which have shown Wikipedia to be just as accurate as Britannica. Now, it's been a while since I looked at a dictionary, but from what I can tell Wikipedia has far more external references than your average encyclopedia. I guess mostly because a Wikipedia page has little credibility on its own. So both as a reference and as a starting point it's better; what remains is just whether it's "respectable". With 99% own content, you can hardly say they've been using this as a strategy. I don't know what you could compare it with; it's as if one Linux app copied some code, and someone called Red Hat not respectable for distributing it despite being completely unaware. Or better yet, tried to imply that their business is built on stolen software. What's next? "I can find text copied without permission on google. They're eeeeeeeeeeevil"?

--
Live today, because you never know what tomorrow brings

victimless?!?!? by abigsmurf · 2006-11-05 06:07 · Score: 1

Victimless crime?

You're not only not buying the book of whoever did the (possibly expensive) research, you're not even crediting them so they get zero credit and because you've got the info you need you're even less likely to seek out the author's work! Just because the perpetrator has little to gain committing the crime doesn't make it victimless!

Re:victimless?!?!? by MarkByers · 2006-11-05 06:43 · Score: 1

Yeah because if you took Wikipedia offline I would immediately go out and buy tons of reference books.... not!

If you are the sort of person that needs to buy expensive research papers, you are not in the target audience for Wikipedia! Wikipedia is not intended to be used for professional research, it's just a little fact book that may or may not be correct, with some links to sources on each page. Nothing more. It's not going to be making a dent into your sales figures, so relax!

If you support Wikipedia, maybe you can even think of a way to benefit from it, you never know...

--
I'll probably be modded down for this...
Re:victimless?!?!? by abigsmurf · 2006-11-05 07:26 · Score: 1

you may not buy his books but you may come across his article on a site he writes for...

ORIGINAL ARTICLE by h2g2bob · 2006-11-05 06:08 · Score: 1

Source of story here:
http://www.wikipedia-watch.org/psamples.html

Even there I don't think there's enough information to actually make a judgement on this. What algorithm did he use? Where are they plagiarised from? How can you tell who copied who (there's a LOT of plagiarism FROM Wikipedia)? Where's the data? How did he select the articles?

In short... show me the evidence.

Re:ORIGINAL ARTICLE by h2g2bob · 2006-11-05 06:20 · Score: 1

In fact, this Wikipedia article on Daniel Brandt makes interesting reading. It's surprisingly long, I hope that doesn't mean it's plagiarised from somewhere ;-D
"I don't regard him as a valid source about anything at all, based on my interactions with him ... He considers the very existence of a Wikipedia article about him to be a privacy violation, despite being a public person. I find it hard to take him very seriously at all. He misrepresents everything about our procedures, claiming that we have a 'secret police' and so on." - Jimbo Wales
That about says it all
Re:ORIGINAL ARTICLE by interiot · 2006-11-05 07:12 · Score: 1

He lists where they're plagiarized from on his website... click on each article and read the box at the top.
He's got a bit more information at these threads: [1], [2], [3] I don't agree with his conclusions, but he said he did put around three weeks of effort going over these by hand to make sure they were legitimate copyvios.
Re:ORIGINAL ARTICLE by Iron+Condor · 2006-11-05 11:07 · Score: 1

In short... show me the evidence.
Coming from a member of the Wikipedia cult, this is bold. Wikipedia insists that there never needs to be any evidence for or against any claim, that anything anybody can type is somehow valid and that truth is decided by who has the most time on their hands to change other people's writing.
Once you have decided that truth is not decided by evidence, you cannot turn around and require that your critics somehow show evidence when they accuse you of something.
If we made evidence the arbiter of truth, 95%+ of Wikipedia would vanish into the thin air from whence it was pulled in the first place. Yes, ninety-five percent: it's true because I said so and it is Wikipedia's claim that things must be true if someone types it.
Such is the danger of publicly disavowing any kind of standard of scholarship: You don't get to question the standard of scholarship of those who question you on anything.

--
We're all born with nothing.
If you die in debt, you're ahead.
Re:ORIGINAL ARTICLE by Planesdragon · 2006-11-05 12:01 · Score: 1

Once you have decided that truth is not decided by evidence, you cannot turn around and require that your critics somehow show evidence when they accuse you of something.

Truth is not now, and never has been, decided by evidence. It has been decided by whoever has made the most cogent argument regarding the proper interpretation of evidence provided. In Wikipedia's case, it's done via reference. In the case of a field of science, it's done via experimentation and journals. In the case of a trial, it's done by three experts in the law presenting a case to a group of 6 or 12 laymen.

When you get right down to it, Wikipedia's got exactly what they should have for a web-based encyclopedia. They're far better than random web searches, and significantly broader than the "authoritative" dead-tree encyclopedias.
Re:ORIGINAL ARTICLE by h2g2bob · 2006-11-05 14:11 · Score: 1

Oops, sorry. Couldn't see the wood for the trees :(
Re:ORIGINAL ARTICLE by mdwh2 · 2006-11-06 02:45 · Score: 1

Wikipedia insists that there never needs to be any evidence for or against any claim, that anything anybody can type is somehow valid and that truth is decided by who has the most time on their hands to change other people's writing.

Where does Wikipedia say this?

Wikipedia actually makes it clear that it is not a matter of deciding truth or not - the requirement for inclusion is verifiability
Re:ORIGINAL ARTICLE by DragonWriter · 2006-11-06 03:42 · Score: 1

Wikipedia insists that there never needs to be any evidence for or against any claim, that anything anybody can type is somehow valid and that truth is decided by who has the most time on their hands to change other people's writing.

Uh, no, Wikipedia doesn't "insist" that; in fact it insists quite the opposite. If material is just something some random person typed without a reliable source, it's "Original Research" forbidden by WP:OR and WP:VERIFY, and where there is a clash between "reliable sources", it has a number of procedures designed to resolve matters other than by edit wars, including (among others) page protection, procedures to attempt to achieve consensus, and the Arbitration Committee.

Such is the danger of publicly disavowing any kind of standard of scholarship: You don't get to question the standard of scholarship of those who question you on anything.

Er, yeah. Too bad for your argument that it is readily verifiable that Wikipedia doesn't disavow any standards of scholarship. You may not like the manner in which they enforce or police the standards they have established, but it's an outright lie to say they have "publicly disavowed any kind of standard of scholarship".

Other concerns about Wikipedia by meburke · 2006-11-05 06:08 · Score: 1

In the Encyclopaedia Britannica and other published, for-sale reference works, the articles' sources are not only attributed, but the author of the article is attributed and his/her credentials displayed as a guide to their qualifications in providing the article.

Now, an article presenting facts can be written by someone who has no academic qualifications but still represents the facts fairly and accurately, so I don't claim that a person MUST be academically qualified to write a good article, nor do I claim that an article is good just because a person with "academic" qualifications writes it. However, I believe that the articles' authors should be identified, and the article parts should be identified as primary, secondary or tertiary.

I go to the Wikipedia for information, but I'm cautious. I want to be able to cite the information in the Wikipedia, and that requires authors and accurate attribution.

--
"The mind works quicker than you think!"

Re:Other concerns about Wikipedia by Ciarang · 2006-11-05 08:44 · Score: 1

I go to the Wikipedia for information, but I'm cautious.

Yes, good. That's what it's there for, and what you need to be.

I want to be able to cite the information in the Wikipedia, and that requires authors and accurate attribution.

It's irrelevant what you want, you can't expect that - it's a wiki. The reply from the AC also makes a good point, despite the unnecessarily offensive tone.
Re:Other concerns about Wikipedia by meburke · 2006-11-05 09:16 · Score: 1

You are correct. I cannot expect the same standards of excellence from the Wikipedia that I can expect from the Britannica. It is a wiki, but the deficiencies of the Wikipedia can be ameliorated by a good editorial policy. I run into lots of product-oriented wikis that have very high standards and are very useful to their users.

And although I seldom reply to a**holes, I agree that the AC had a point, so I indulged myself in a reply.

--
"The mind works quicker than you think!"
Re:Other concerns about Wikipedia by epine · 2006-11-05 09:25 · Score: 1

I want to be able to cite the information in the Wikipedia, and that requires authors and accurate attribution.

I think you are suffering from a simmering all-things-to-all-people syndrome. The world already has an enormous body of eminently citable published work amassed over a period of centuries. Should Wikipedia take on the mandate to refactor the whole of human knowledge? That doesn't sound prudent to me.

What the world lacked was an instant-gratification synopsis to the existing body of world knowledge. I don't think credentialism is required to deliver in that niche, and I would be more inclined to suspect that credentialism is toxic rather than complementary to what Wikipedia is trying to achieve.

I've given some thought to the origins of credentialism in human culture. In part, credentialism was a response to bandwidth constraints: rather than attempt to communicate the entire process engendering the final result, only the final result is put forward, but with a seal-of-approval from the credentialed sect which serves to state this spinach is of good and proper origins.

Now that we have the technical instrument to convey all the history, all the time, to all the people, why would we fall back on credentialism again? Because we're used to it? Because those who invested in their credentials are determined to maximize their return-on-investment? At what point do those holding credentials work more in the sake of perpetuating their own interests than the interests of society at large?

I've always preferred a good question to a good answer. My feeling is that the Wikipedia is presently at the stage of posing some good questions. One of those questions concerns trust gradients. In the whole of the animal kingdom, if a creature comes across something that might possibly be edible, the creature first considers whether to put it in its mouth (usually by means of the instantaneous smell test).

Why is it that we presume the average person is incapable of conducting a simple smell test when encountering information on the Wikipedia? I think it stems from the pre-internet educational culture where everything written on the blackboard was reflexively ingested and regurgitated on demand.

For the post-internet generation I suspect this bias will prove far less strong. It's not possible (that I can fathom) to grow up from the age of primary school in the internet generation and not learn that the internet is full of information you don't put directly into your mouth without first a moment of thought. Wouldn't it be nice if this ultimately extended to the evening news (such as it survives)? Of any information source out there, it strikes me that the evening news deserves the most scrutiny of all.

The average Wikipedia article introduces the entire slate of keywords required to plumb google of 90% of the material available on the internet concerning the topic in question (the other 10% requires serious google-fu), plus the whole of the article's creation history, plus all the online debate about the merits of one account over another. And search will only improve over time.

I'm coming to the rather strong position that credibility is (or ought to be) an active process in the mind of the reader rather than a dull and comforting vestige in the identities of the authors. Before Wikipedia, this wasn't a practical approach, so what we are dealing with here concerning credentialism is proof by incumbency rather than passing any sensible smell test.

Big tobacco managed to delay the inevitable for several decades waging a war of credentialist propaganda. There's no level on the trust pyramid where you don't have to trust your nose and we shouldn't be conveying blind faith at any level of the process.

Plagarism is common but usually promotional by Animats · 2006-11-05 06:23 · Score: 1

Plagiarism shows up frequently in Wikipedia, but usually it's promotional. Typically, company X copied their "about" page into Wikipedia. Bands and musicians, usually ones that are a legend only in their own minds, try this. A new user associated with the thing being promoted is usually responsible.

Then there are the people with a collector mindset. They create endless minor articles like "Indiana State Highway 22" and biographical articles of long-forgotten city council members. Often by cutting and pasting. This is annoying, but complaints of copyright infringement are unlikely.

Ug another techono geek tries to prove he's smart. by swalters1 · 2006-11-05 06:42 · Score: 1

Alright, we've seen this before. Someone writes a program to prove something, then runs it, dances around with the results and says, "Hey look over here, I proved that Wikipedia is stealing info from other sites..." But he leaves off his results, how the code works or how he verified that his results were accurate. What if the site that he's crediting the source material with was really the one who stole it in the first place? How many generations and cross matches did he perform? Is it per word and ordered matching, or does it consider the phrases: "We went to work today" and "Went to work today" to be the same?
If you're going to run an article like this and expect people to take it seriously, we need details. LOTS OF DETAILS.
Does my comment mean I don't think some of the content is uncredited, or stolen? No, it probably is, but anytime anyone presents what amounts to an experiment, it should be held to a scientific standard and subject to peer review, otherwise you end up with a bunch of people thinking something that is fact, is not, and something that is not fact is. People need to be reminded to think critically when we see articles like this, or any article that makes a claim based on "my research" or "my program". Just because you made an experiment that proves your hypothesis, doesn't mean it proves anything. I want more details, I want to review his findings, I want to review his process, and I want to see how deep he dug before he claimed that something in the public domain was actually not credited to its source.

How does he know? by mr_zorg · 2006-11-05 06:45 · Score: 1

If I wrote an article on some subject and then decided to share that information with Wikipedia, I may well just copy my text verbatim. Does that make it plagiarism? If I wrote the text, why can't I reuse it? How does this guy know that's not what's going on here?

Re:How does he know? by interiot · 2006-11-05 07:02 · Score: 1

It's at least internal Wikipedia policy that there needs to be verification that the original author is posting the article (either by modifying the original site to note that the article is released under the GFDL, or by sending an email to the Wikimedia Foundation confirming its GFDL status). Without more formal confirmation, it's difficult to say whether the off-wiki author is the same as the on-wiki one, either from a plagiarism standpoint or a legal one.

Re:1% plagarism! by goombah99 · 2006-11-05 07:08 · Score: 1

cute but this has nothing to do with plagiarism. Press releases are meant to be copied.

--
Some drink at the fountain of knowledge. Others just gargle.

Wikipedia less than perfect... by maxume · 2006-11-05 07:11 · Score: 1

Can it still move forward? More at 11.

--
Nerd rage is the funniest rage.

Re:Brandt is a Republican by DragonWriter · 2006-11-05 07:25 · Score: 1

However, he is well aware that Wikipedia's "no copyright violations" policy requires users to immediately quash plagiarized content.

How can one be "well aware" of something that isn't true? Wikipedia's copyright policies (WP:C and WP:COPYVIO) address copyright violations, not plagiarism. You can have a copyright violation without plagiarism: for instance, if the use of properly quoted, properly cited material exceeds legal "fair use", it is not plagiarism while it is a copyright violation. And you can likewise have plagiarism without copyright violation: for instance, if material is not subject to copyright (perhaps it's a US government work) but is used without attribution and presented as someone else's work, it is plagiarism, but is not a copyright violation.

Wikipedia doesn't have a policy on "plagiarism", per se, AFAICT, though WP:VERIFY and WP:CITE are relevant to the issue.

Re:Truth is a POV and unacceptable at Wikipedia. by Sloppy · 2006-11-05 07:40 · Score: 1

When are you truthists going to get over your zealous bias, and at least accept the falsists' right to exist, even when you don't agree with them?

--
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.

Who is Daniel Brandt anyway? by YGingras · 2006-11-05 07:51 · Score: 1

You might like to know that Daniel Brandt founded Google Watch back in the old days to protest against page rank. Yes, Google Watch was originally just against how Google didn't give Mr. Brandt a good page rank. Now he has added some bits about privacy, but I think anyone should visit Google Watch now to see how childish Daniel Brandt is. And using Google to do datamining is against the acceptable use policy anyway.

Re:1% plagarism! by nbauman · 2006-11-05 07:54 · Score: 1

>Any Journal article comprised of 1% plagiarism would be subject to law suits, apologies and the journal would face ostracism.

Am I correct in assuming that you pulled that 1% number out of the air? If not, could you give me a source?

Remember, 60% of all statistics are wrong.

Depends on what you're writing by benhocking · 2006-11-05 08:02 · Score: 1

IIRC, at least 2/3 of what you write should be your own conclusions, described in your own words, with the bulk of the rest expected to comprise conclusions reached by others, but described in your own words. Direct quotations should not make up more than a very small part of any academic paper.

If you're writing a summary article (e.g., on the current state of data mining), then as little as 10% (or even less) could be your own conclusions. However, if you're writing about your own research, then you definitely want most of it to be your own conclusions. (The first paragraph of what you wrote, however, is spot on.)

--
Ben Hocking
Need a professional organizer?

Re:Depends on what you're writing by WNight · 2006-11-05 10:54 · Score: 1

And that's really only relevant if you're doing it for a grade. If I could replace a week-long research project for one of my clients by simply pointing to a URL (quoting) and writing a paragraph or two describing the relevance to them they'd love me.

And as for school, I've long thought that we should make direct copy&paste legal in essays, if attributed (ie, as a quote), and simply give more marks for original content. This way many people who don't have the skills for original composition would still learn how to write an essay or business case because they'd have gotten some marks for what works in the real world, rather than a failing grade for ivory tower reasons.

Re:Brandt is a Republican by MostAwesomeDude · 2006-11-05 08:14 · Score: 1

Heh, and usually I'M the pedant. You're right, that should read "...quash copyrighted content used improperly." Thanks.

--
~ C.

correct arithmetic is 1.2% by gdavid · 2006-11-05 08:24 · Score: 1

It's not out of 1 million, it's out of the 12,000 he examined. That comes to 1.2%, which is still pretty good.
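
For anyone who wants to check the arithmetic, here is a minimal sketch. It takes the 142 flagged articles and the 12,000 examined from the numbers quoted in this thread; the variable names are just for illustration.

```python
from fractions import Fraction

flagged = 142      # articles flagged, as quoted in this thread
sampled = 12_000   # articles examined, as quoted in this thread

# Exact ratio, so the comparison against 1% has no rounding issues.
rate = Fraction(flagged, sampled)

print(f"{float(rate):.2%}")       # 1.18%
print(rate > Fraction(1, 100))    # True
```

So the figure rounds to 1.2% and sits slightly above the 1% line, at least if all 142 are counted.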

Re:1% plagarism! by goombah99 · 2006-11-05 08:31 · Score: 1

142/12000 = 1.2%

--
Some drink at the fountain of knowledge. Others just gargle.

Especially since he used to sell such info himself by Reziac · 2006-11-05 08:57 · Score: 1

From the Wiki article:

"From the 1960s onwards, Brandt collected clippings and citations pertaining to influential people and intelligence matters. In the 1980s, through his company Micro Associates, he sold a database of citations of these clippings, books, government reports, and other publications."

Pot, kettle, hello.....??!

--
~REZ~ #43301. Who'd fake being me anyway?

Verifiability by tepples · 2006-11-05 09:07 · Score: 1

Wikipedia's job is to provide accurate information.

Not exactly. The job of Wikipedia (or for that matter any other general encyclopedia) is to provide verifiable information from reliable sources. Verifiability > truth until the truth becomes verifiable.

Re:You want to cite an encyclopedia entry? by meburke · 2006-11-05 09:09 · Score: 1

Sorry, maybe I should have been a little clearer for the comic book intellectuals out there: I wouldn't write a scientific paper citing an encyclopaedia entry, but in discussion I like to be able to report where I obtained information without having to discount the information as a frivolous source. I can easily say, "I read an article in Encyclopaedia Britannica, 14th Edition, by T. E. Lawrence, regarding Guerilla Warfare. In the article, he said,...etc., etc." Now go look up the article on Neuro-Linguistic Programming in the Wikipedia, read the notes and rhetorical bullshit, and you will realize it was mostly written by offensive egomaniacs (I presume you can relate), with highly biased views, little expertise in NLP, and not much intellectual integrity. The article is useless, even as a starting point for discussion.

The problem, as I see it, is that the founders of the Wikipedia would like to have it regarded seriously as an information source, but the editorial standards are too low. Identifying the qualifications of the authors would allow us to ignore articles written by raving idiots like yourself.

--
"The mind works quicker than you think!"

So? by crhylove · 2006-11-05 09:46 · Score: 1

Wikipedia is a free and open source of information for all people. I'm all for it if they get decent data off of somewhere else. I'd like it if it was more properly bibliographized or whatever, but that seems like small potatoes over the background of what it is that wikipedia is attempting to be for the world.

What stupid textbook industry shill came up with this crackpot survey? And as somebody else pointed out, ~1% plagiarism isn't exactly high in my opinion.

Stop being an asshole, and if you DO find plagiarism, label it as such, and give us a better footnote as to where the original information came from.

Jesus!

rhY

--
I hold very few opinions. I hold information based on observation and fact. If you wish to disagree, please use facts.

Re:Linux FAQ by masterzora · 2006-11-05 09:57 · Score: 1

I know, I know, "Don't feed the trolls", and I know this is a typical copy-paste troll, but I just have to say this:

Do you realize that that is self-contradictory a few times? Such as when it builds up trying to say Linux is painfully difficult to use and then near the end it states that you only need a little more knowledge to use it? It can't be both! This is why anti-Linux people on /. are generally seen as trolls: they usually are.

--
Remember, open source is free as in speech, not free as in bear.

DANIEL BRANDT = JACKASS by hkmwbz · 2006-11-05 10:20 · Score: 1

Sorry for shouting in the subject, but I am surprised that no one has pointed this out before. Brandt is nothing but a jackass with a personal agenda - to get back at Google for not ranking his lame ass page higher, and Wikipedia for revealing things about him. Good read, by the way:

Google Watch Watch

--
Clever signature text goes here.

Only one of the papers I linked to was for a grade by benhocking · 2006-11-05 11:02 · Score: 1

The other was submitted to (and accepted by) a journal. Submitting to a journal, of course, is more like submitting for a grade than doing a project for a client.

--
Ben Hocking
Need a professional organizer?

I'd say this numerically proves its superiority by skogs · 2006-11-05 11:02 · Score: 1

Only 140-something out of 12,000 articles were plagiarized?

Wow.

I seem to remember from high school that the major dictionaries sometimes put made-up words in their dictionaries in order to catch plagiarizing competitors. Similarly, they'll add an extra fake definition to a word, and then watch over the next decade or two to see if another dictionary picks up the fake definition.

Encyclopedias do the same. Add some small tidbit of fake information to an article to see if it surfaces somewhere else.

I don't believe the dictionary and encyclopedia publishers do this by accident...they do it because they have experienced such things before and found this to be a very easy way to prove stupidity and plagiarism on the other person's part.

Honestly...I think fewer than 200 out of over 12,000 articles actually proves that it is quite good...and indeed non-plagiarised. Especially considering that wiki articles tend to be significantly longer, more in depth, and with more recent and politically charged items in them...I think it proves quite a large degree of integrity on wikipedia's part.

--
Who is this that even the wind and the waves obey Him? Surely this computer must submit also!

Irony, pick up the white courtesy phone... by DrLazer · 2006-11-05 11:27 · Score: 1

Am I the only one who finds it vaguely humorous that the same Daniel Brandt who wants to bust Google's balls has to use Google in his campaign to bust Wikipedia's balls?

--
If it wasn't for half of the people in this country, the other half would be all of them -- Col. Stoopnagle

Re:1% plagarism! by Achromatic1978 · 2006-11-05 11:27 · Score: 1

Yeah, except for Wikipedia, who is getting credit for being a fountain of knowledge and research that, well, wasn't researched by them? There's money coming into Wikipedia from somewhere.

I have a problem with this part of the article... by The+Slaughter · 2006-11-05 11:57 · Score: 1

But editors found extensive problems in several cases, with many still not yet fully checked. Articles with offending passages have been stripped of most text. An entire paragraph in Alonzo Clark's entry, for instance, was deleted, leaving the article with the bare-bones: "Alonzo M. Clark (August 13, 1868-October 12, 1952) was an American politician who was Governor of Wyoming from 1931 to 1933." The original article, Brandt said, was copied from a biography on the Wyoming state government site.

Uh, OK, but there really isn't anything wrong with that, as long as it's cited. If it's on the Wyoming state government site, it's public domain. Works that the government puts out should all be public domain. (Taxpayers .. paid for them. There is no copyright.)

Re:1% plagarism! by sg_oneill · 2006-11-05 12:28 · Score: 1

RTFA (Read the fucking article!)

However, I can guarantee the plagiarism rate is higher than 1% in academic journals. Pretty much any *honest* academic will tell you it's rife, particularly amongst stressed-out PhD candidates.

--
Excuse the Unicode crap in my posts. That's an apostrophe, and slashdot is busted.

Re:I have a problem with this part of the article. by imthesponge · 2006-11-05 12:46 · Score: 1

You'd assume so, but copyright law only explicitly excludes works by the U.S. federal government. States and other local governments can and often do claim copyright on their work. http://cendi.dtic.mil/publications/04-8copyright.html#30

Something I just plagiarised... by TheVelvetFlamebait · 2006-11-05 13:21 · Score: 1

"When you steal from one author, it's plagiarism. When you steal from many, it's research."

- Wilson Mizner

--
You know, there is a difference between trolling and pointing out the flaws in your reasoning. Just saying.

Re:re-read the post by meburke · 2006-11-05 13:35 · Score: 1

If you had read the post, you would have noticed that I don't believe credentials are sufficient for evaluating information. And if you had read the response to the AC, you would have seen the clarification on cites. I think you have jumped to a conclusion not warranted by the text. In fact, looking over your post, I see you have managed to exhibit 19 of the 83 common rhetorical fallacies (a couple more than once), and still miss the point: The founders of the Wikipedia hold the Wikipedia to be a reputable source of knowledge and information, yet some authors do not reveal their own expertise or biases. In fact, we have numerous instances where the authors have battled to change the content of articles to promote their own point of view without disclaimer, or to hide information inimical to their particular point of view. This problem (I define a problem as a discrepancy between the way things are and the way I want them to be) is, IMO, something that could be resolved by a stricter editorial policy. The Wikipedia is only a place to begin discussion, but it is not a reliable place to start discussion at this time. Although the majority of the articles have additional links, many times the links are selected for their bias rather than their objectivity.

Your point about trustability is well-taken. If I read something in the Wikipedia I want to know if it is a fact or an opinion. If it is a fact, I want to determine the probability, "Is it true?" If it is an opinion, I want to know what the arguments are for and against the opinion. An argument should stand on its merits, but if I know a particular author of an article about Capitalism is a high-ranking member of the Socialist Workers Party, I am warned to examine the arguments a little closer. One strength of the Wikipedia is that a very biased article will probably get amended or challenged pretty quickly. However, "article by consensus" is not really an unbiased source of information either.

As for the post-internet generation: I have seen no evidence that the internet has improved thinking and independent thought. Although somewhat of a ranter, John Taylor Gatto ( http://www.johntaylorgatto.com/ ) has pointed out the deficiencies of our educational system. My interpretation of his writing is to conclude that the education system, particularly public schooling, suppresses independent, creative and objective thought. What I see personally on the internet, is a proliferation of what Sociologists term "crowd behavior", in which the intelligence exhibited by the crowd performs significantly below the average intelligence of the individual members. YMMV.

--
"The mind works quicker than you think!"

Plagiarism the myth the fact. by CherniyVolk · 2006-11-05 14:18 · Score: 1

I hope this isn't modded as flame bait, because I'm really being honest here in my argument. So, I'll get down to it.

First, American education enforces plagiarism. That's right! How so? Well, take for instance the fact that almost every test in any mundane American education facility almost always encourages the student to regurgitate a canned answer from a designated source of information. It gets even worse when you enter the University level, and is unbelievably worse yet, if you enter any top tier University (where the professors themselves demand you buy *their* book).

Even if such a class exists, as "Critical Thinking", there's really nothing truly critical about it. Factor into the above another fact: American and European societies are bent on "Political Correctness". This only serves to deter true critical thinking, because any deterrent or pressure NOT to speak your mind, regardless of how vulgar it is, takes away from the full spectrum of perceptive analysis. Even academia is infected with this, which is why you never see any books dedicated to the good things about Hitler or the bad things about Gandhi, even though any person in their right mind will admit, even if in private, that Yin and Yang did not elude either of the two. There's a formula for the above. If X is a positive admission, Y is a negative admission and Z is the general image you are trying to paint the person in (where Z is a magnitude of either X or Y), then for any X/Y granted that is not a magnitude of Z, the opposite SHALL be grotesquely exaggerated so as to make the other negligible. That's why not one single historian or author is willing to point out the obvious and say (for example) Hitler was a genius and leave it at that. They have to stress to childish levels of zeal that he was insane or cruel or anything to belittle the positive claim. The same thing goes for other icons who aren't seen in a negative light... Dr. Martin Luther King, Gandhi, etc. It's VERY difficult, and in Europe outright illegal, to have any real impartial analysis of these icons; in fact, there's a man in Germany on trial because he questions a lot of the fabricated claims about Hitler. Winston Churchill was no angel, yet how many texts are there that focus on only the bad without trying to backpedal and counter the negative claims to preserve his image? None.

Work environments... when you're asked to document, you always have to pull stuff verbatim from sources your boss might respect. In English 101, we have to write papers with references for each and every claim. While this is an entirely different debate, on the laziness of ignorant people above you who prefer that you have references rather than understand your statements well enough to agree or disagree, this does encourage you to simply copy an idea once you establish that someone important said it. How many ways are you going to "put in your own words" an idea of someone so much smarter or wiser than yourself? It suffices to accept the adage as profound wisdom and preserve it in all its glory. (That's a well made remark!)

Also, not to mention, a fact does not have a poetic license! Meaning that there are only so many ways to be sweet and direct in explaining what the Pythagorean Theorem is. Anything more is just bluff and fluff. As for a self perceived description of Gandhi, that would vary greatly from one person to another because no claim would be reflective of Truth and Fact; however, for any real facts... "Gandhi was born on October 2, 1869"... how many other ways are you going to say that? We can use synonyms, pick up a thesaurus... we can hire someone to practically translate it to Latin and toss the translation to the reader... or, perhaps we can just go all around the world for that simple statement and then copyright a paragraph and a half all for just stating the date he was born. "Gandhi was born on October 2, 1869", must be plagiarism! If you say it out loud, you're in c

Re:Plagiarism the myth the fact. by gunny01 · 2006-11-05 18:13 · Score: 1

I agree 100%. Especially since we are more and more studying 'opinion' pieces, but there is always one right opinion. For example, recently in class, I questioned the validity of the (Australian Indigenous) Stolen Generation: whether it was appropriate to call it a stolen generation. Was my opinion valued and discussed?

Of course not. I was shot down for being a "racist bigot" who should "actual learn" and that I needed to "get a heart".

Re:Plagiarism the myth the fact. by AtomicJake · 2006-11-06 04:38 · Score: 1

First, American education enforces plagiarism. That's right! How so? Well, take for instance the fact that almost every test in any mundane American education facility almost always encourages the student to regurgitate a canned answer from a designated source of information. It gets even worse when you enter the University level, and is unbelievably worse yet, if you enter any top tier University (where the professors themselves demand you buy *their* book).

If you study facts and principles in the university, you should learn them and it's good practice that you also study those references that describe them. In the best case: the original. That has nothing to do with plagiarism.

[lot of stuff about Hitler and Gandhi.]

This stuff about Hitler and Gandhi is complete nonsense. E.g. you can tell positive things about Hitler; however, probably nobody wants to hear it, because this is not why he is notorious. In some countries in Europe you can be prosecuted if you deny Hitler's crimes (e.g. the Holocaust), but not for just saying that he has also done some positive things (e.g. building the Autobahn). [Disclaimer: I do not think that Hitler did one good thing; I just make the point that it is not illegal to say it.]

Also, not to mention, a fact does not have a poetic license!

Correct. And you completely confuse plagiarism with copyright or other laws. You are not allowed to copy verbatim without proper attribution to the sources. That's all. You should have learned that in school (and not plagiarism...).

What Brandt _should_ do, rather than crowing by Howzer · 2006-11-05 15:54 · Score: 2, Insightful

Is release the script or code that he used to generate his 142 plagiarised articles out of 12,000.

Such a script, if tuned and more widely applied, could be extraordinarily useful in weeding out future instances of plagiarism.

142 articles flagged, 142 articles fixed within hours. That's Wikipedia working as no dead-tree encyclopedia can.

Of course, Brandt would never do anything as useful as that, but will probably content himself with continuing to "shoot from the hip" and claim this as a blow against the Wikipedia community, rather than a bravura demonstration of exactly how well it works.
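
To make the "script" idea concrete, here is a rough Python sketch of one way such a checker could work. The thread does not say how Brandt actually did it, so this is not his method: it is a plain word-shingle overlap test, and every name in it (`shingles`, `overlap`, `flag_suspects`, the 0.5 threshold, the 5-word window) is an assumption picked for illustration. The sample sentence is the Alonzo Clark stub quoted elsewhere in this discussion.

```python
import re


def shingles(text: str, k: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping k-word windows ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def overlap(candidate: str, source: str, k: int = 5) -> float:
    """Fraction of the candidate's shingles that also occur in the source."""
    cand = shingles(candidate, k)
    if not cand:
        return 0.0  # candidate shorter than k words: nothing to compare
    return len(cand & shingles(source, k)) / len(cand)


def flag_suspects(articles: dict[str, str], sources: dict[str, str],
                  threshold: float = 0.5) -> list[tuple[str, str, float]]:
    """List (article, source, score) triples whose overlap meets the threshold."""
    hits = []
    for title, body in articles.items():
        for name, text in sources.items():
            score = overlap(body, text)
            if score >= threshold:
                hits.append((title, name, score))
    return hits


stub = ("Alonzo M. Clark (August 13, 1868-October 12, 1952) was an American "
        "politician who was Governor of Wyoming from 1931 to 1933.")
articles = {"Alonzo M. Clark": stub}
sources = {"hypothetical source page": "Some biography text. " + stub}
print(flag_suspects(articles, sources))
```

A hit from something like this would only be a lead for a human to check. It cannot tell public-domain text, properly cited quotes, or a site that copied from Wikipedia apart from actual plagiarism.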

Re:What Brandt _should_ do, rather than crowing by Howzer · 2006-11-10 21:04 · Score: 1

Breaking news: Fourteen people were killed today by a rogue Britannica article. When asked what they were going to do about the fatally incorrect information in their popular dead-tree product, Britannica managers stated "Well, we just hope that no-one else finds that article! I mean, we haven't got a new edition out until the year after next."

Independent investigators have looked at other Britannica articles edited by the same "expert" -- a task that was complicated by Britannica's non-transparent editorial schema -- and say there may be a systematic problem. There may be other fatal articles in the 28-volume set.

No-one seems to be willing to put a figure on how many more people might be killed before these rogue articles are changed "year after next."

Hello, Brandt. If you're going to post Anonymous Coward at least _make sense_.

Re:1% plagarism! by nbauman · 2006-11-05 17:07 · Score: 1

Yeah, but where have you ever heard of a journal article composed of 1% plagiarism subject to law suits, apologies or ostracism? I can't think of any.

I used to catch Newsweek plagiarizing from the Wall Street Journal and the Village Voice. I wrote them letters challenging it. They claimed they got the same quotes independently, which was obvious bullshit. I remember walking into a newspaper office and seeing a guy rewriting an article from the New York Times. Trade magazines use quotes from the WSJ and NYT all the time. It happens all the time. I've never heard of them being sued. Can you cite a verifiable source?

148 : 12000 by Max+Threshold · 2006-11-05 17:19 · Score: 1

Only 148 articles out of over 12,000? That doesn't sound very newsworthy. Plus, no mention was made of checking those 148 for contributions from the original authors of works found elsewhere on the Web.

Re:1% plagarism! by nbauman · 2006-11-05 18:16 · Score: 1

I can recall one case, reported in I think Science magazine, of a PhD student whose native language was not English. He submitted a thesis in which he had copied entire passages from other works. Somebody took offense at that, and tried to bring some kind of academic charges against him -- no lawsuit was involved. The PhD student said that it was an honest mistake, because he wasn't familiar with the style of attribution, and besides, his supervisor had approved. Furthermore, his defenders claimed that all PhD theses in this field copied heavily from other work to give the background of the research (where else are you going to get the background?) and the only difference was the degree to which he had remained faithful to his sources. The only original work in these papers was the report of the original research.

You must be new around wikipedia by Project2501a · 2006-11-05 19:31 · Score: 1

It has been decided by whomever has made the most cognizant argument regarding the proper interpretation of evidence provided.
Clearly, you have never edited any of the wiki articles on evolution, the Balkans or the Palestinian/Israeli conflict.

--
----

Re:1% plagarism! by Macthorpe · 2006-11-05 20:43 · Score: 1

You had that response all planned out didn't you?

It was made obvious by the fact that he didn't accuse you of anything.

--
"It does not do to leave a live dragon out of your calculations, if you live near him." - Tolkien
