Software Finds Plagiarism In Research

Posted by CmdrTaco on Wednesday October 27, 2010 @02:37AM from the grant-revoked dept.

shmG writes "Researchers from the Virginia Bioinformatics Institute have created a seek-and-destroy program — for plagiarism. Called ET Blast, it's designed to find plagiarism in scientific papers. It does a full-text analysis, and then looks for similar publications in several databases. 'We have better literature,' Garner said. 'There are abstracts and full papers, and a database called Crisp, where you compare stuff to every grant the NIH gets. It's compared to any research that's been funded.'"

Re:What about ... by notgm · 2010-10-27 02:42 · Score: 4, Insightful

if you resubmit your own work, it's not plagiarism.
There's nothing about "destroying" in the article by Zontar_Thing_From_Ve · 2010-10-27 02:52 · Score: 2, Insightful

I can't blame the submitter for this one. The article itself uses the term "search and destroy" early on, yet says absolutely nothing about destroying anything.
plagiarism differs in science vs. English Lit. by onionman · 2010-10-27 02:55 · Score: 5, Insightful

I once had an English teacher who said, "If you have more than five consecutive words matching a source without a citation, then it's plagiarism." Perhaps that's how freshman writing assignments are graded, but it's silly when applied to scientific papers. Pick up any math paper on number theory, and you're bound to find the sentence "Let p be an odd prime number." without citation, but that would hardly qualify as plagiarism. Yet, syntactic matching appears to be exactly what this program is doing.
What constitutes "plagiarism" in a scientific paper is very different from plagiarism in journalism or English literature. In scientific writing, it is expected that authors will use the same flat, impersonal style and repeat definitions and the results of others to save the reader the time of having to look them up. So, simple pattern matching between science papers will result in a great many false positives. In science (and math) writing what matters is the new result which the author is claiming. It seems to me that it would be nearly impossible for a computer program to detect the distinction.
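To make the five-word rule concrete, here is a minimal sketch in Python of that kind of naive syntactic matching. It is not how the program in the article works (the article doesn't describe its method in that detail); the function names `word_ngrams` and `shared_runs` and both sample sentences are made up for illustration.

```python
import re

def word_ngrams(text, n=5):
    """Return the set of all runs of n consecutive words in text (lowercased)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_runs(candidate, source, n=5):
    """Return the runs of n consecutive words that appear in both texts."""
    return word_ngrams(candidate, n) & word_ngrams(source, n)

# Two hypothetical, unrelated number theory papers that open the same way.
paper_a = "Let p be an odd prime number. We show that the sum vanishes."
paper_b = "Let p be an odd prime number and let g be a primitive root."

flagged = shared_runs(paper_a, paper_b)
for run in sorted(flagged):
    print(" ".join(run))
```

The two made-up papers share three five-word runs, all from the stock opening sentence, so a rule this blunt would flag the pair even though neither author borrowed anything from the other.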
1. Re:plagiarism differs in science vs. English Lit. by pz · 2010-10-27 03:16 · Score: 3, Insightful
  
  Furthermore, when a scientist has spent a number of years on a long-term research plan, the condensed versions of what he is studying become so well rehearsed that they get memorized. I have stock phrases that I use when I want to describe this or that aspect of my work because, after giving dozens of presentations about it, they are the ones that work best. They are the most highly polished and refined. They communicate the idea well. And so, they often get trotted out with every manuscript or grant application. My students and post-docs learn to use the same phrasing because, flatly, it works.
  None of the instances of those phrases or full sentences require attribution because they are all from the same motherspring of thought. We are the writers. And, as you might imagine, this might well produce a raft of false positives to a system that blindly compares text.
  
  --
  
  Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
2. Re:plagiarism differs in science vs. English Lit. by Oxford_Comma_Lover · 2010-10-27 03:39 · Score: 4, Insightful
  
  > Perhaps so, but I could see where such a rule could come from, and it could instill a discipline of making sure things are properly cited. Without any other context, obviously the rule is rubbish, but I could see it as an excellent rule to live by when taking freshman courses in writing/composition.
  But that's half the problem. The rule may come from a desire to instill discipline, but it's just a bad rule, because it teaches that plagiarism of ideas isn't plagiarism at all, and that stringing five words together in a way that's been used before is, and that rewriting something in your own words makes it no longer plagiarism.
  Demand students live by a childish rule, and you will at best be someone they have to ignore as they try to actually learn things.
  
  --
  -- IANAL, this isn't legal advice, and definitely isn't legal advice for you. Also, Squee!