Slashdot Mirror


Competition Seeks Best Approaches To Detecting Plagiarism

marpot writes "Does your school/university check your homework/theses for plagiarism? Nowadays, probably yes, but are they doing it properly? Little is known about plagiarism detection accuracy, which is why we are running a competition on plagiarism detection, sponsored by Yahoo! We have set up a corpus of artificial plagiarism which contains plagiarism with varying degrees of obfuscation, and translation plagiarism from Spanish or German source documents. A random plagiarist was employed who attempts to obfuscate his plagiarism with random sequences of text operations, e.g., shuffling, deleting, inserting, or replacing a word. Translated plagiarism is created using machine translation."
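
The "random plagiarist" in the summary could look roughly like the sketch below. This is only an illustration under stated assumptions, not the competition's actual corpus generator: the function name `obfuscate`, the parameters `num_ops`, `seed`, and `vocabulary`, the four-word shuffle window, and the choice to draw inserted or replacement words from the passage itself are all hypothetical.

```python
import random


def obfuscate(text, num_ops=5, seed=None, vocabulary=None):
    """Apply num_ops random word-level operations to text (illustrative only)."""
    rng = random.Random(seed)  # a seed makes the obfuscation reproducible
    words = text.split()
    # Assumption: with no vocabulary given, new words come from the passage itself.
    pool = list(vocabulary) if vocabulary else list(words)

    for _ in range(num_ops):
        if not words:
            break
        op = rng.choice(["shuffle", "delete", "insert", "replace"])
        if op == "shuffle":
            # Shuffle a short window of up to four consecutive words.
            start = rng.randrange(len(words))
            window = words[start:start + 4]
            rng.shuffle(window)
            words[start:start + 4] = window
        elif op == "delete":
            # Keep at least one word so the passage never becomes empty.
            if len(words) > 1:
                del words[rng.randrange(len(words))]
        elif op == "insert":
            words.insert(rng.randrange(len(words) + 1), rng.choice(pool))
        else:  # replace
            words[rng.randrange(len(words))] = rng.choice(pool)

    return " ".join(words)


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog"
    print(obfuscate(source, num_ops=6, seed=42))
```

Raising `num_ops` gives a rough handle on the "degree of obfuscation": more operations move the passage further from its source.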

4 of 289 comments

  1. Re:Defeat Plagerism by gnick · · Score: 5, Funny

    Simply using words would not constitute plagiarism. You just can't allow students to use words that somebody else has used before.

    For more information on this technique, please read my recent paper, Clickous Verandim Redundo Berata Quizzomandus.

    --
    He's getting rather old, but he's a good mouse.
  2. Require submission of drafts; meet with students by cpu_fusion · · Score: 5, Interesting

    Plagiarism is a symptom of professors only being involved in the last step: reviewing the final product.

    Require the students to submit multiple drafts. Meet with them for 15 minutes each and discuss their thought processes on the ongoing paper. You'll get better final products, teach people not to procrastinate, and smoke out people who have no involvement in their "own work."

    What, can't do that because you have 60 students in a class? Well, there's part of the problem too.

    We're trying to find a technological solution, with even less student-teacher interaction, to the problem. Typical!

  3. The humanities are in trouble. by Areyoukiddingme · · Score: 5, Interesting

    Seriously, the humanities are in trouble. With over 6 billion people on the planet, it's extremely difficult to have an original thought. This sets the stage for endless repetition. Add to that the fact that the very process of teaching the humanities usually means imparting a teacher's single interpretation of the source material to the students. They then do the natural thing when it comes to writing a paper: parrot back to the teacher what they've heard, knowing that's the only way to get a good grade. The resulting combination is deadly.

    The papers are all going to be similar from the beginning, because it's a rare instructor who actually encourages dissenting opinions (and that fault in teaching is a whole other discussion of its own). Then the papers are going to be similar because there really are only so many ways to interpret the source material that are defensible. And finally, the papers are highly likely to be similar to at least one other paper written about the subject, when every paper ever written on the subject is considered (exactly what the plagiarism sites attempt to do).

    I think the problem this competition is trying to solve is intractable in the face of the current educational system. It's gotten to the point where, if the software considers a large enough number of sources, even the instructor's own papers are going to look like plagiarism.

    Hell, look at the Slashdot comment system. A million people read the front page, but only a few thousand post comments. Thousands more are content to simply moderate the comments, and face it, comments they agree with are more likely to be modded up, one way or another. Then compare the modded comments. We get a lot of duplicate or near duplicate thought, and hence near duplicate comments on every article. Why? Because when you get enough people together in one place, discussing the same subject in writing, there are only so many viewpoints and only so many comments that won't get modded down for being of the "cubic what?" variety.

    Time to go back to grading on spelling and grammar. We've reached the end of the grading on ideas road. Coherency of presentation is all we have left. (One could argue it's all we ever had.)

  4. Who needs plagiarism? by Ralph+Spoilsport · · Score: 5, Insightful

    When you've got Markov Generators?

    And the Postmodernism Generator?

    You don't have to write much of anything at all. Would you get a good grade? Fuck no. Would they FLUNK YOU FOR IT? Fuck no. Because it's graded by untenured faculty who have to curry favour with students, or it's graded by Grad Assistants who don't give a shit, and why should they?

    Oh, look, a paper by Cindy Bleethstain. She's a fucking idiot. Let's see. Hmmmm. Yup. Incomprehensible bullshit, as usual. Give her a C+ because some of it is intelligible and kind of funny.

    Oh, look, another paper by Guido LeDouchebag. Bottlecaps are smarter than this turnip. Hmmm. Yup. More incomprehensible bullshit. C+. At least he finally discovered the spellchecker.

    THAT'S what it is often like, unfortunately.

    I read the paper, and if there is a passage that is noticeably different in tone, I'll copy and paste a section into Google and see where they pulled it. 9 times out of 10, it's a direct lift from a web page, unattributed. I send it back, and tell them "Footnotes, please. Also, automatic single grade loss. Right off the top."

    If it comes back still broken, then I nail 'em for plagiarism. It's a big deal, and requires paperwork I don't like to fill out...

    So far I've only had one student have the cojones to not bother fixing their attributions, and he got crucified by the Ethics board. He was an arrogant little prick, too.

    RS

    --
    Shoes for Industry. Shoes for the Dead.