Bloggers are the New Plagiarism
mjeppsen writes "PlagiarismToday offers a thought-provoking article that frankly discusses concerns with plagiarism and rote content theft among bloggers. In the section entitled "Block quotes by the Dozen" the author mentions the so-called "gray area". That is PlagiarismToday's classification of the common blogger practice of re-using large blocks of text/content from the original article or source, even when the source is attributed."
even when the source is attributed.
It's not plagiarism then, is it?
There are shills on slashdot. Apparently, I'm one of them.
I've seen the results of this study before somewhere...
The opposite of progress is congress
I hope somebody has quickly plagiarised their article because their server appears to be already slashdotted.
Not that it's Slashdotted or anything, I just thought it'd be funny.
---
The Investor Relations Web Report calls it "the new plagiarism". Dan Zarella from Puritan City calls those who engage in it "the best plagiarists". Others simply call them bloggers or, as Zarella also put it, "Human Aggregators".
They're a new breed of content users that walk a gray area between that which is clearly fair use and what is obviously content theft. Their blogs are marked with large swaths of block quotes and heavy content reuse, but also proper attribution and at least some original content.
These sites, as they've grown in number, have created a great deal of controversy among bloggers who are left to wonder if they are nothing more than content thieves in disguise.
Block quotes by the Dozen
These sites, which for this article I'll simply call "gray", are generally identified by a large number of very short posts, with much of the content in block quotes or otherwise lifted directly. Though they meticulously credit their sources, bowing to more traditional rules for blog attribution, and work to add at least some original content, usually over half of their material comes from other sources.
This has caused many bloggers to worry that these gray blogs might be trying to get away with content theft under the guise of legitimate attribution. The idea is that they can create a much larger volume of content if they only have to write a small portion of it. Users will simply visit the gray blogs, since they are able to provide so much more information, and, thanks to the liberal quoting, will have no reason to visit the original source. After all, they already have most of the critical information.
While gray blogs certainly don't pose the same threat or raise the same concerns as spam blogs and other content scrapers, the cause for concern is clear. Even though blogging is about sharing and reusing information, excessive sharing threatens the authors penning the original content. The tale of the goose laying the golden egg springs to mind as, quite simply, greed can be the blogging world's biggest enemy.
A Separation of Degrees
What makes this issue so difficult to address, and so difficult to write about, is that it's not so much about gray blogs as about various shades of gray blogs. The difference between someone simply quoting blogs and someone trying to tweak the system is not a clear-cut matter, but a separation of degrees.
Quoting, even liberal quoting, is expected of blogs. It's a part of researching a story and covering ongoing stories as well as sharing information. If done properly, it can not only be used to create a new work, but also drive valuable traffic to the original site. In the blogging world, being the source is often a badge of honor.
However, basing your entire site, or even a large percentage of it, on quoted content is viewed differently. Being a source in a larger article is one thing, but having your content make up the majority of an article on another site is another. What distinguishes one from the other is unclear at best. There are no math formulas or systems for determining what is right or what is too much.
More confusing still, everyone has a different idea of what constitutes content theft. With Creative Commons Licenses being very common, it's obvious some feel that copying an entire work is acceptable so long as attribution is affixed. Others would place the boundary well within what is usually considered fair use.
The challenge becomes to strike a balance and set some kind of guideline that is compatible with copyright law and acceptable under the current code of blogging ethics, but also able to address the concerns many bloggers share over gray sites.
A Proposed Solution
When I first looked at the problem, I was tempted to set guidelines by which a blogger should not get more than X percent of their overall content from other sites or use more than Y lines from another entry.
For example:
Sometimes the block of text is preceded by "from the article:", but half the time it is presented as comments from the story submitter, and the Story Approvers (I refuse to call them editors) do absolutely squat to correct it.
Please help metamoderate.
Given the volatile nature of the web today, there's an excellent chance that the page you link to today will be gone 6 months from now. If you want your post to have any value in the future, it needs to be more than just "Hey, look here!" (Although, except in the case of the shortest source articles, copy-pasting the entire page is bad form.)
Of course, for your post to have any value today, just quoting isn't enough. At that point, it may as well be a link. You have to provide some commentary, maybe your opinion, maybe additional information, or maybe you're just using the quote as a springboard to go off on your own topic.
It comes down to a balance: are the quotes there to support and/or provide context for your own words? Are they there as a summary so that someone wandering by a year from now knows what people are talking about? Or is it little more than an unauthorized mirror?
"But there are 2 ways to do it: Summing up the content and providing a link, or ripping a few lines out of context and then mentioning in the fine print where they're from. ...
So while I'm all for gathering info and making it available to your readers, I'm also very much against the "Readers Digest" approach: Snipping out what I deem valuable, copying it to my page and giving half-hearted credit to the real author. Linking is cool. Copy-paste-blogging is just lame."
Yes, some bloggers do the equivalent of e-mail threads where they copy an entire piece, blockquote it and then add one or two sentences of their own. That's stupid.
But there are reasons to quote extensively from materials provided you're offering extensive commentary in return (and giving the proper credit up front to the author you're quoting from).
1. Summing up the content is not always that easy to do. I've seen plenty of mainstream media reports where the two-paragraph summary completely misrepresents what was actually said. I try to quote as extensively as I can precisely to avoid the appearance of mischaracterizing someone's argument.
2. Linking is great but my experience in about 10 years of writing for my own web site is that about 80% of the things you link to will be 404 within two years. Not to mention sites like the BBC's where if you go back to a story a couple years later it will likely have been completely rewritten without any sort of notice that changes were made post-publication to the text.
Parent is correct - plagiarism is claiming someone else's work as your own original work. If you attribute the work, it is clearly not plagiarism, and not a 'gray area'. The only 'gray area', I would say, would be copyright violation. It is fair use to quote someone else. But at what point in copying large blocks of someone else's copyrighted material do you cross the line from fair use to copyright infringement?
Personally, I would err on the side of fair use - particularly if the bloggers are adding significant amounts of criticism/commentary (for example, Groklaw recently commented on the blog of some ZDNet analyst, and PJ included almost the entire text of the blog entry - but that is because she was doing a point by point rebuttal of his tripe - that should be considered fair use, because it's almost impossible to rebut in entirety, if you cannot quote in entirety). If they copy 5 pages of article text and add a 3 line summary/critique at the top, that, to me, would not be fair use.
First site:
http://www.boingboing.net/2005/05/19/cuba_switching_to_gn.html
Which leads me to: http://linux.slashdot.org/
And the only link out of those that's still up is http://www.theinquirer.net/?article=23300, which contains only:
So all this plagiarised summarisation bullshit leads me only to http://news.yahoo.com/s/afp/20050517/tc_afp/cubac
And before I know it, 15 minutes are gone and all I've learned is that 1500 computers have been switched. Thank you, plagiarism. And the beautiful irony of it all is that I'm contributing to it with this post!