Slashdot Mirror


Software Finds Plagiarism In Research

shmG writes "Researchers from the Virginia Bioinformatics Institute have created a seek-and-destroy program — for plagiarism. Called ET Blast, it's designed to find plagiarism in scientific papers. It does a full-text analysis, and then looks for similar publications in several databases. 'We have better literature,' Garner said. 'There are abstracts and full papers, and a database called Crisp, where you compare stuff to every grant the NIH gets. It's compared to any research that's been funded.'"

16 of 111 comments

  1. What about ... by gstoddart · · Score: 3, Interesting

    What about academic "recycling"?

    I remember being told a long time ago that some researchers will basically make several permutations of the same paper to submit to a bunch of different places. It's essentially the same paper, with nothing new in it, but if you can get several places to publish it, you can pad out your publications list.

    --
    Lost at C:>. Found at C.
    1. Re:What about ... by notgm · · Score: 4, Insightful

      if you resubmit your own work, it's not plagiarism.

    2. Re:What about ... by robotkid · · Score: 2, Informative

      if you resubmit your own work, it's not plagiarism.

      Let me clarify the issue for those not accustomed to the rules of scientific publishing.

      There IS such a thing as self-plagiarism, and it's not necessarily a minor offense. At its core, if you submit essentially the same work to multiple venues with the intent to pass each off as an independent body of work when they are not, then there is intent to deceive, and that is an ethical breach of conduct. In the worst case, the author list and abstract have been changed just enough to lead others to believe this particular experiment has been independently confirmed and duplicated when it has not.

      Most journals require that you affirm that the same manuscript is not currently under consideration for publication in another journal and has not already been published in a highly similar form elsewhere (except maybe as a conference abstract). This is different from re-submission, where a manuscript was rejected by one publication and you are now free to send it to another venue. And then there is the copyright issue: as authors you are not necessarily the sole copyright holder (often the journal has some claim), in which case a duplicate publication is actually a violation of the journal's copyright.

      There is also the case where one comprehensive study is artificially split into smaller, less meaningful sub-studies with the intent to pad publication counts (there was an example of a prenatal intervention study where the effects on the mothers and on the infants for the exact same study were published separately without any reference to each other, diminishing the usefulness of the study). This is no longer a copyright issue but a scientific integrity issue: presumably the medical audience of such a study could be harmed by not being told, in any obvious way, both sets of outcomes for the same study.

      There is an excellent resource on what constitutes scientific plagiarism (including self-plagiarism) here: http://facpub.stjohns.edu/~roigm/plagiarism/Self%20plagiarism.html

  2. How is this different from Turnitin? by mlts · · Score: 4, Informative

    This sounds almost exactly like turnitin.com: when one uploads a paper to it, it searches almost anything it can get ahold of and lists any text in any academic journal that is copied verbatim.

    1. Re:How is this different from Turnitin? by redbeard55 · · Score: 2, Interesting

      There is no harm if you have done the required work. Being able to use your work in more than one place doesn't harm anyone. Assertions to the contrary are ridiculous.

  3. There's nothing about "destroying" in the article by Zontar_Thing_From_Ve · · Score: 2, Insightful

    I can't blame the submitter for this one. The article itself uses the term "search and destroy" early on, yet says absolutely nothing about destroying anything.

  4. Shocking plagiarism already found by Anonymous Coward · · Score: 2, Funny

    They found that a research paper on hydrogen stole two-thirds from an existing paper on water.

  5. Re:Red faces all round then.... by noidentity · · Score: 2, Funny

    Since researchers constantly plagiarize their own work

    Is this where the author of something passes it off as his own? I agree, that's a terrible thing.

  6. plagiarism differs in science vs. English Lit. by onionman · · Score: 5, Insightful

    I once had an English teacher who said, "If you have more than five consecutive words matching a source, without a citation then it's plagiarism." Perhaps that's how freshman writing assignments are graded, but it's silly when applied to scientific papers. Pick up any math paper on number theory, and you're bound to find the sentence "Let p be an odd prime number." without citation, but that would hardly qualify as plagiarism. Yet, syntactic matching appears to be exactly what this program is doing.

    What constitutes "plagiarism" in a scientific paper is very different from plagiarism in journalism or English literature. In scientific writing, it is expected that authors will use the same flat, impersonal style and repeat definitions and the results of others to save the reader the time of having to look them up. So, simple pattern matching between science papers will result in a great many false positives. In science (and math) writing what matters is the new result which the author is claiming. It seems to me that it would be nearly impossible for a computer program to detect the distinction.
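
    To make the false-positive worry concrete, here is a minimal sketch of the "more than five consecutive words" rule applied mechanically. It is only an illustration of naive syntactic matching, not how the program in the article works; the function names and the two sample sentences are made up for the example.

```python
import re

def word_ngrams(text, n=6):
    """Return the set of all runs of n consecutive words in text (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_ngrams(a, b, n=6):
    """Return the n-word runs that appear in both texts.

    n=6 encodes "more than five consecutive words"; that threshold is
    the English teacher's rule, not a property of any real tool.
    """
    return word_ngrams(a, n) & word_ngrams(b, n)

# Two hypothetical papers that share nothing but a stock opening sentence.
paper_a = "Let p be an odd prime number. We show that the sum vanishes."
paper_b = "Let p be an odd prime number. Then the group is cyclic."

hits = shared_ngrams(paper_a, paper_b)
for run in sorted(hits):
    print(" ".join(run))
```

    Both papers get flagged purely because of the boilerplate sentence, even though their claimed results have nothing in common.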

    1. Re:plagiarism differs in science vs. English Lit. by pz · · Score: 3, Insightful

      Furthermore, when a scientist has spent a number of years on a long-term research plan, the condensed versions of what he is studying become so well rehearsed that they get memorized. I have stock phrases that I use when I want to describe this or that aspect of my work because, after giving dozens of presentations about it, they are the ones that work best. They are the most highly polished and refined. They communicate the idea well. And so, they often get trotted out with every manuscript or grant application. My students and post-docs learn to use the same phrasing because, flatly, it works.

      None of the instances of those phrases or full sentences require attribution because they are all from the same motherspring of thought. We are the writers. And, as you might imagine, this might well produce a raft of false positives to a system that blindly compares text.

      --

      Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
    2. Re:plagiarism differs in science vs. English Lit. by Oxford_Comma_Lover · · Score: 4, Insightful

      > Perhaps so, but I could see where such a rule could come from, and it could instill a discipline of making sure things are properly cited. Without any other context, obviously the rule is rubbish, but I could see it as an excellent rule to live by when taking freshman courses in writing/composition.

      But that's half the problem. The rule may come from a desire to instill discipline, but it's just a bad rule, because it teaches that plagiarism of ideas isn't plagiarism at all, and that stringing five words together in a way that's been used before is, and that rewriting something in your own words makes it no longer plagiarism.

      Demand students live by a childish rule, and you will at best be someone they have to ignore as they try to actually learn things.

      --
      -- IANAL, this isn't legal advice, and definitely isn't legal advice for you. Also, Squee!
    3. Re:plagiarism differs in science vs. English Lit. by Idarubicin · · Score: 3, Interesting

      You seem to be infected by the IP bug.

      Fortunately for the rest of us, one cannot plagiarize ideas. Reformulating a concept in your own words does not count as plagiarism, nor should it.

      You seem to be infected by a different sort of IP bug.

      Plagiarism is not the same thing as copyright infringement (though it's not uncommon for the same act to involve elements of both). One can plagiarize public domain sources. One can plagiarize ideas.

      Plagiarism is what happens when a writer presents other people's work (their words or their ideas) as his own, without giving due credit to the source. Pretending that you thought of something when you're actually just copying another author's reasoning is intellectual dishonesty, and squarely within the realm of plagiarism.

      If you copy someone's words verbatim, there is an added obligation to specifically identify the copied passage by blockquoting, using quotation marks, or otherwise clearly setting off the passage from the rest of your writing. If you're just paraphrasing, there's no obligation to use quotation marks (that would be silly) but there remains a need to properly name your source (through footnotes or other means). Rewriting someone else's work in your own words is otherwise still very much plagiarism.

      --
      ~Idarubicin
  7. Re:You can't plagiarize yourself [Re:What about .. by Travelsonic · · Score: 5, Interesting

    In High School, they tried to cram the concept of "self plagiarism" down our throats - what a crock of shit... you can NOT by DEFINITION plagiarize YOUR OWN WORKS. Recycling may be lazy, may violate other ethics, but to call it plagiarism is, IMO, very intellectually dishonest of these institutions.

    --
    If you believe in privacy, and believe you have "nothing to hide" at the same time, you're a goddammed idiot
  8. Re:You can't plagiarize yourself [Re:What about .. by Anonymous Coward · · Score: 3, Informative

    I actually ran into this in grad school. When writing a tech related paper, I referenced one of my past papers on the same subject as a source. My professor made it clear I had to cite myself to avoid "self-plagiarism". I thought it quite possibly the stupidest thing I had ever heard in my life, and it was coming from a celebrated PhD at a major New England university.

  9. That's all fine and good, but... by bobdotorg · · Score: 2, Interesting

    ... can it find dupes on Slashdot?

    --
    __ Someday, but not this morning, I'll finally learn to use the preview button.
  10. Re:You can't plagiarize yourself [Re:What about .. by Anonymous Coward · · Score: 5, Interesting

    Yes, but maybe the problem is that we don't have good terms to differentiate between appropriate reuse of one's own writing and unacceptable reuse.

    For instance, it's a violation of academic ethics to try to publish the exact same paper in multiple places. You're effectively trying to increase your publication count without adding anything new to the body of knowledge. It's still not plagiarism, since it's your own work, but it is unethical.

    Not citing previous work when writing a paper is also wrong, though not in the same way. It can be an honest mistake, laziness, or downright unethical behavior (e.g. not citing the work of someone you don't like). Not citing your own previous work in the area is similarly wrong. Not because it would be plagiarism, but because citations are vital to help others understand the context, significance, and background of the present work. So you should cite yourself when appropriate, just as you would cite others.

    And lastly, there are times where re-using your own material is absolutely acceptable. For instance when releasing a new edition of a book, it just makes sense to tweak the things that need changing. It doesn't make sense to rewrite every sentence to avoid 'plagiarizing' yourself. Similarly if you write a review article of a certain field, it just makes sense to re-use some of the text from a previous review (now outdated) that you wrote. (There may or may not be secondary copyright concerns, depending on the various contracts in place.) It isn't plagiarism, and it isn't wrong.

    Perhaps academia needs to develop terms to cleanly differentiate between these cases. Or alternately people need to be more specific when they are talking about appropriate vs. inappropriate behavior. Abusing "plagiarism" as a catch-all for "unethical publication" confuses the issue.