Slashdot Mirror


Hoax-Detecting Software Spots Fake Papers

sciencehabit writes: In 2005, three computer science Ph.D. students at the Massachusetts Institute of Technology created a program to generate nonsensical computer science research papers. The goal was "to expose the lack of peer review at low-quality conferences that essentially scam researchers with publication and conference fees." The program — dubbed SCIgen — soon found users across the globe, and before long its automatically generated creations were being accepted by scientific conferences and published in purportedly peer-reviewed journals. But SCIgen may have finally met its match. Academic publisher Springer this week is releasing SciDetect, an open-source program to automatically detect automatically generated papers. SCIgen uses a "context-free grammar" to create word salad that looks like reasonable text from a distance but is easily spotted as nonsense by a human reader.
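For readers unfamiliar with the term, a context-free grammar is a set of rewrite rules that expand placeholder symbols into words or further placeholders until only words remain. The sketch below is a rough illustration of that idea only. The grammar, the vocabulary, and the names `GRAMMAR`, `expand`, and `generate_sentence` are all made up for this example and are not taken from SCIgen, whose real rule set is far larger.

```python
import random

# Toy grammar invented for illustration; NOT SCIgen's actual rules.
# Keys are nonterminals; each maps to a list of possible productions.
GRAMMAR = {
    "SENTENCE": [
        ["we", "VERB", "that", "NOUN_PHRASE", "and", "NOUN_PHRASE", "are", "ADJ", "."],
        ["NOUN_PHRASE", "can", "be", "made", "ADJ", "and", "ADJ", "."],
    ],
    "NOUN_PHRASE": [["ADJ", "NOUN"], ["NOUN"]],
    "VERB": [["confirm"], ["disprove"], ["argue"]],
    "ADJ": [["scalable"], ["probabilistic"], ["decentralized"]],
    "NOUN": [["hash tables"], ["compilers"], ["neural networks"]],
}


def expand(symbol, rng):
    """Recursively expand a symbol into a list of words."""
    if symbol not in GRAMMAR:
        # Anything that is not a grammar key is a terminal (a literal word).
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words


def generate_sentence(seed=None):
    """Generate one grammatical but meaningless sentence."""
    rng = random.Random(seed)
    text = " ".join(expand("SENTENCE", rng)).replace(" .", ".")
    return text[0].upper() + text[1:]


if __name__ == "__main__":
    for i in range(3):
        print(generate_sentence(seed=i))
```

Each sentence is grammatical because every production is, but nothing ties the randomly chosen words to any meaning, which is why the output reads as word salad to a human.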

3 of 61 comments

  1. Evil tech? by Anonymous Coward · · Score: 5, Interesting

    The purpose of the scam papers was to expose scam journals.
    The purpose of this new software seems to be to allow scam journals to continue scamming.
    So it's evil software that should not have been developed, right?

    I mean, if you were doing actual peer review, none of this would pass even a half-sentient peer's inspection.

  2. It is too much trouble to fix the problem by Attila Dimedici · · Score: 4, Interesting

    Springer reveals that they are not interested in fixing the problem exposed by SCIgen; they just want to prevent that software from demonstrating that they have not fixed it. They aren't going to change the review process to ensure that they no longer publish nonsense papers. No, they developed software to eliminate only those papers that were generated by other software.

    --
    The truth is that all men having power ought to be mistrusted. James Madison

  3. Re:Results? by phantomfive · · Score: 3, Interesting

    Of all the problems you might find at arXiv, I don't think "auto-generated papers going undetected" is one of them.

    ArXiv's problem is recognizing when human-written, realistic-sounding papers are actually BS.

    --
    "First they came for the slanderers and i said nothing."