Bloggers are the New Plagiarism
mjeppsen writes "PlagiarismToday offers a thought-provoking article that frankly discusses concerns with plagiarism and rote content theft among bloggers. In the section entitled "Block quotes by the Dozen" the author mentions the so-called "gray area". That is PlagiarismToday's classification of the common blogger practice of re-using large blocks of text/content from the original article or source, even when the source is attributed."
even when the source is attributed.
It's not plagiarism then, is it?
There are shills on slashdot. Apparently, I'm one of them.
I've seen the results of this study before somewhere...
The opposite of progress is congress
I hope somebody has quickly plagiarised their article because their server appears to be already slashdotted.
I agree, it is easy to copy and paste, and with the proliferation of blogs, online stories, etc., noticing and detecting it becomes proportionately harder.
What makes this issue so difficult to address, and so difficult to write about, is that it's not so much about gray blogs, but rather, various shades of gray blogs. The difference between someone simply quoting blogs and someone trying to tweak the system is not a clear-cut matter, but a separation of degrees.
Quoting, even liberal quoting, is expected by blogs. It's a part of researching a story and covering ongoing stories as well as sharing information. If done properly, it can not only be used to create a new work, but also drive valuable traffic to the original site. In the blogging world, being the source is often a badge of honor.
"Waste not one watt!" - CZ
Not that it's Slashdotted or anything, I just thought it'd be funny.
---
The Investor Relations Web Report calls it "the new plagiarism". Dan Zarella from Puritan City calls those who engage in it "the best plagiarists". Others simply call them bloggers or, as Zarella also put it, "Human Aggregators".
They're a new breed of content users that walk a gray area between that which is clearly fair use and what is obviously content theft. Their blogs are marked with large swaths of block quotes and heavy content reuse, but also proper attribution and at least some original content.
These sites, as they've grown in number, have created a great deal of controversy among bloggers who are left to wonder if they are nothing more than content thieves in disguise.
Block quotes by the Dozen
These sites, which for this article I'll simply call "gray", are generally identified by a large number of very short posts, with much of it in block quotes or otherwise directly lifted content. Though they meticulously credit their sources, bowing to more traditional rules for blog attribution, and work to add at least some original content, usually over half of their material comes from other sources.
This has caused many bloggers to worry that these gray blogs might be trying to get away with content theft under the guise of legitimate attribution. The idea is that they can create a much larger volume of content if they only have to write a small portion of it. Users will simply visit the gray blogs since they are able to provide so much more information and, due to the liberal quoting, will then have no reason to visit the original source. After all, they already have most of the critical information.
While gray blogs certainly don't pose the same threat or raise the same concerns as spam blogs and other content scrapers, the cause for concern is clear. Even though blogging is about sharing and reusing information, excessive sharing threatens the authors penning the original content. The tale of the goose that laid the golden eggs springs to mind as, quite simply, greed can be the blogging world's biggest enemy.
A Separation of Degrees
What makes this issue so difficult to address, and so difficult to write about, is that it's not so much about gray blogs, but rather, various shades of gray blogs. The difference between someone simply quoting blogs and someone trying to tweak the system is not a clear-cut matter, but a separation of degrees.
Quoting, even liberal quoting, is expected by blogs. It's a part of researching a story and covering ongoing stories as well as sharing information. If done properly, it can not only be used to create a new work, but also drive valuable traffic to the original site. In the blogging world, being the source is often a badge of honor.
However, basing your entire site, or even a large percentage of it, on quoted content is viewed differently. Being a source in a larger article is one thing, but having your content be the majority of the article on another site is another. What distinguishes one from the other is unclear at best. There are no math formulas or systems for determining what is right or what is too much.
More confusing still, everyone has a different idea of what constitutes content theft. With Creative Commons Licenses being very common, it's obvious some feel that copying an entire work is acceptable so long as attribution is affixed. Others would place the boundary well within what is usually considered fair use.
The challenge becomes to strike a balance and set some kind of guideline that is compatible with copyright law and acceptable under the current code of blogging ethics, but also able to appease the concerns many bloggers share over gray sites.
A Proposed Solution
When I first looked at the problem, I was tempted to set guidelines by which a blogger should not get more than X percent of their overall content from other sites or use more than Y lines from another entry.
Nobody can read the whole internet. Nobody. So what people do is they rely on others to pick the interesting pieces worth reading and go from there.
But there are 2 ways to do it: Summing up the content and providing a link, or ripping a few lines out of context and then mentioning in the fine print where they're from.
While the first is something I do agree with, the second stinks of "I don't have content but I want visitors, but if I hand out my sources my visitors might go there instead of to me."
So while I'm all for gathering info and making it available to your readers, I'm also very much against the "Reader's Digest" approach: Snipping out what I deem valuable, copying it to my page and giving half-hearted credit to the real author. Linking is cool. Copy-paste-blogging is just lame.
And I really wish this message could be sent to those who do it just that way.
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
For example:
Sometimes the block of text is preceded by "from the article:", but half the time, it is presented as comments from the story submitter, and the Story Approvers (I refuse to call them editors) do absolutely squat to correct it.
Please help metamoderate.
Given the volatile nature of the web today, there's an excellent chance that the page you link to today will be gone 6 months from now. If you want your post to have any value in the future, it needs to be more than just "Hey, look here!" (Although except in the case of the shortest source articles, copy+pasting the entire page is bad form.)
Of course, for your post to have any value today, just quoting isn't enough. At that point, it may as well be a link. You have to provide some commentary, maybe your opinion, maybe additional information, or maybe you're just using the quote as a springboard to go off on your own topic.
It comes down to a balance: are the quotes there to support and/or provide context for your own words? Are they there as a summary so that someone wandering by a year from now knows what people are talking about? Or is it little more than an unauthorized mirror?
I thought the proposed "solution" in the article was just stupid. The idea that somehow the law should police millions of blogs by applying some kind of complex formula to determine if they are in the wrong is just not feasible. Even if blogs are the worst source of plagiarism, there is really nothing that can be done about it, except raise public awareness.
Philosophy.
Parent is correct - plagiarism is claiming someone else's work as your own original work. If you attribute the work, it is clearly not plagiarism, and not a 'gray area'. The only 'gray area', I would say, would be copyright violation. It is fair use to quote someone else. But at what point in copying large blocks of someone else's copyrighted material do you cross the line from fair use to copyright infringement?
Personally, I would err on the side of fair use - particularly if the bloggers are adding significant amounts of criticism/commentary (for example, Groklaw recently commented on the blog of some ZDNet analyst, and PJ included almost the entire text of the blog entry - but that is because she was doing a point by point rebuttal of his tripe - that should be considered fair use, because it's almost impossible to rebut in entirety, if you cannot quote in entirety). If they copy 5 pages of article text and add a 3 line summary/critique at the top, that, to me, would not be fair use.
even when the source is attributed.
It's not plagiarism then, is it?
- Whiney Mac Fanboy
(If you get the joke, you'll mod this up)
Oh no... it's the future.
First site:
http://www.boingboing.net/2005/05/19/cuba_switching_to_gn.html
Which leads me to: http://linux.slashdot.org/
And the only link out of those that's still up is http://www.theinquirer.net/?article=23300, which contains only:
So all this plagiarised summarisation bullshit leads me only to http://news.yahoo.com/s/afp/20050517/tc_afp/cubac
And before I know it, 15 minutes are gone and all I've learned is that 1500 computers have been switched. Thank you plagiarism. And the beautiful irony of it all is that I'm contributing to it with this post!
"How Opal Mehta Plagiarised, Got Busted, and Got Kicked out of Harvard"
I'm an anti-copyright advocate who sees more power in releasing my information for free to the ether of the Internet. Not only do I not copyright my blog posts, e-books and music, I openly request others to copy it and even put their own name on it. I've realized that once I put something into easily copied form, it will be copied. It might be partially used, fully mimicked, or completely turned upside down, yet I've also found that the more I am copied, the more people tend to find out that I am the original author.
For me as a writer, I love to know that people are reading me and replying to me -- that is my "profit" in the short term -- reader input. I tend to make up my own words that I write with, in order to see who might be copying me fully. I then look at what people say about their "writings", too. One such word I created was unanimocracy, but I've invented a few other phrases that are easily searched, too.
I believe the best way to "fix" plagiarism isn't to make it more illegal or immoral, but to work on a free market and open system where content creators can submit their creations to be cataloged as "the first." Let others copy it, but Google or another toolbar can easily flag a new creation as "very similar to another." Imagine if the Google toolbar had a "% of originality" for every site you visit (or every paragraph you highlight with your mouse). This could work for lyrics, guitar tabs, writings, opinion, news articles, etc.
Plagiarism is "OK" in some circles -- do a Google News search and see how many big-name media outlets just regurgitate each others' news. Boring. Bloggers do the same thing, but many put a unique spin on the original writer's ideas. I love when people plagiarize me. In the long run it builds my credibility even if they don't reference me as the original writer. I'd rather find free market solutions (such as the one I outlined above) than find penalties for the copying.
If someone discovers that the person they respect didn't write the content on their own, the market fixes this by making the reader not read the plagiariser anymore. Easy solution. In the long run, trying to protect your creative works will be a losing process. I use my previous creations to gain new customers who appreciate the information that I don't share. That is the product/service I sell, and I use my years of writing to show a history of original opinion and beliefs. Anything I write for public consumption is merely a marketing tool to get people to hire me for real face-time -- I could care less if someone else found a better way to make money with my thoughts. Most of my thoughts are based on a lifetime of reading and thinking about what others say. My blog network forum is based completely on the comments of others -- I even pay my readers who give me the best comments. Their input on my writings is what gives me MORE information to sell at a higher price to those willing to pay for my knowledge. Why should I stop others from using my works to create new opinions that I can learn from?
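For what it's worth, the "% of originality" idea in this comment can be roughed out in a few lines. This is only a sketch under my own assumptions, not how Google or any real toolbar works: it treats a text as a set of overlapping three-word runs ("shingles") and reports what share of a new text's shingles appear in none of the earlier, already cataloged texts. The function names `shingles` and `originality_percent`, and the shingle size of 3, are made up for illustration.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short for a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def originality_percent(candidate, earlier_texts, size=3):
    """Percentage of the candidate's shingles found in none of the earlier texts."""
    cand = shingles(candidate, size)
    if not cand:
        return 100.0  # Nothing to compare, so nothing was copied.
    seen = set()
    for text in earlier_texts:
        seen |= shingles(text, size)
    return 100.0 * len(cand - seen) / len(cand)


if __name__ == "__main__":
    first = "Nobody can read the whole internet so people rely on others"
    repost = "Nobody can read the whole internet so I copied this bit"
    print(round(originality_percent(repost, [first]), 1))
```

A score near 0 means the text is almost entirely lifted from something already cataloged; a score near 100 means little or no word-for-word overlap. It says nothing about attribution or fair use, and paraphrased copying would slip straight past it.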
It is all a part of the demagoguery used by both sides.
There is no "-1 offended" or "-1 you don't agree with me" mod options for a reason.
Well, I don't foresee a rash of bloggers rushing out to crib chunks of Moby Dick. And clearly, when they correctly cite their sources, it's not plagiarism.
On the other hand, with the internet cash flow model being built around page views, it is clearly dishonest for a blogger to simply copy-paste someone else's content on their own site.
Someone who is actually creating their own content would be satisfied with a hyperlink. For them to be pasting huge chunks of material suggests to me that they have a simple (and intellectually dishonest) profit motive.
On the other hand, I do like the occasional full article text post, but I think that should only be in the comments, and only where there is a link in the top-level post, which is either restricted (i.e. WSJ, NYT, AJC, etc) or Slashdotted.
Either way I think a content provider could make a solid case for copyright infringement. If I printed my own copy of someone else's book with a citation at the beginning stating that all that follows comes from this other book, then I'm clearly ripping them off.
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
I'm in the habit of quoting large portions of articles, or even the entire article, for a purely practical reason: the mutability of Web pages. I've lost track of how often I've made a comment about something in an article, only to have a lot of people asking what I was talking about because the article said no such thing. On looking at the article again, the passage I was referring to had either been removed or altered to say something it hadn't said originally. The only way I have to combat this is to preserve a copy of the article as I originally read it in a place not subject to editing by the article's owner.
I'd note this after-the-fact rewriting tends to be most common where the original article contained egregiously and provably incorrect statements and the authors got called on the matter and now want to never have said that (as opposed to wanting to admit they mis-stated).
I spent a few minutes trying to call up the original article so I could respond with a thoughtful statement about how the original article says it's "the new plagiarism."
And then I read your bit and realized I didn't need to. It's amazing how many people don't seem to understand that the New Something shouldn't be the Old Something because then it would just be the Old Something. Maybe the article should try and coin a new phrase for the phenomenon like "polypasting" or "prolificopy" or something. That way everyone would know it's something not quite plagiarism.
The original article isn't saying "to take someone's work is plagiarism"; it's saying "there's a new wrinkle in plagiarism, one in which bloggers of all kinds are block quoting chunks of material and SOMETIMES attributing, sometimes not." (Not an actual quote, but from what I managed to read it's the article's premise.)
Of course, now I'm going to add my take on the situation.
Yes, the majority of the news sites get information from their AP feed and paste it. It's what happens. Do we really need CBS, ABC, NBC, Fox News, CNN, BBC, and who knows who else each over at News Point A interviewing the same three guys involved in the same story, or is it sometimes better to just have the AP or Reuters write the story, give them their cut and be done with it?
Justifying wholesale theft of copyrighted works, as some people do, because "in two years" their link might go down is indistinguishable from the people who say it's "legal" to host their collection of roms because nobody makes the original Nintendo any more (I wonder how the rom-sites will change their justification now that virtually every major game company has some retrogaming solution available, whether it's those plug-n-play tv things, Xbox Live, Gametap, Nintendo's upcoming game-download thingie) or that it would be legal to download and torrent all of CNN's content (because after all, their logo indicates it's their content, right?) because they only air their articles for a day or so at most and after that it's gone.
Websites can go away, just like books go out of print, and movies and TV shows can go out of distribution. Whether you see copyright infringement of any of this as a good thing depends on what side of the coin you're on. For every "OMG! I can't believe I almost wasn't able to get this vital information because the original website was going to delete the article" person, there's another person saying "hey, I spent time reading, researching, maybe even interviewing the players for more perspective, all so that my readers would get content I created (and possibly click on my adsense), and Jimbo stole it all, slapped a quick *not my work* label on it (and possibly a click on his adsense) and I get bupkis."
But back to my first point, I agree, New does not equal Old.