Hoax-Detecting Software Spots Fake Papers
sciencehabit writes: In 2005, three computer science Ph.D. students at the Massachusetts Institute of Technology created a program to generate nonsensical computer science research papers. The goal was "to expose the lack of peer review at low-quality conferences that essentially scam researchers with publication and conference fees." The program — dubbed SCIgen — soon found users across the globe, and before long its automatically generated creations were being accepted by scientific conferences and published in purportedly peer-reviewed journals. But SCIgen may have finally met its match. Academic publisher Springer this week is releasing SciDetect, an open-source program to automatically detect automatically generated papers. SCIgen uses a "context-free grammar" to create word salad that looks like reasonable text from a distance but is easily spotted as nonsense by a human reader.
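For anyone wondering what "context-free grammar" means in practice, here is a minimal sketch of the general technique. It is not SCIgen's actual grammar or code: the rule names, the vocabulary, and the `expand` and `generate_sentence` helpers are all made up for illustration. Each nonterminal is replaced by a randomly chosen production until only plain words remain.

```python
import random

# Hypothetical toy grammar (NOT SCIgen's real rules). Keys are nonterminals;
# each value lists the alternative productions for that nonterminal.
# Any symbol that is not a key is treated as a terminal word.
GRAMMAR = {
    "SENTENCE": [["NP", "VP", "."]],
    "NP": [["the", "ADJ", "NOUN"], ["our", "NOUN"]],
    "VP": [["VERB", "NP"], ["VERB", "NP", "in", "NP"]],
    "ADJ": [["scalable"], ["probabilistic"], ["omniscient"]],
    "NOUN": [["framework"], ["methodology"], ["algorithm"]],
    "VERB": [["refines"], ["synthesizes"], ["emulates"]],
}


def expand(symbol, rng):
    """Recursively expand a symbol into a flat list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit the word unchanged
    production = rng.choice(GRAMMAR[symbol])  # pick one alternative at random
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words


def generate_sentence(seed=None):
    """Generate one grammatical but meaningless sentence."""
    rng = random.Random(seed)  # a fixed seed makes the output repeatable
    words = expand("SENTENCE", rng)
    # Join the words with spaces and attach the final period without a gap.
    return " ".join(words[:-1]) + words[-1]


if __name__ == "__main__":
    for _ in range(3):
        print(generate_sentence())
```

Every sentence it prints is grammatical because the rules guarantee the structure, and meaningless because the words are picked at random. That is why the result looks plausible from a distance and falls apart as soon as a human actually reads it.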
So? Surely, after coding this up, the first thing any scientist would do is scan, at the very least, all of arXiv, and see what comes out as fake? I mean I have seen my fair share of papers that might as well have been generated by SCIgen and the like.
Software detecting papers written by software -- in the dark.
Chicken chicken, (chicken) chicken?
https://www.improbable.com/airchives/paperair/volume12/v12i5/chicken-12-5.pdf
Left MS Windows for Linux Mint and never looked back!
Vote for Bernie in 2016!
The purpose of the scam papers was to expose scam journals.
The purpose of this new software seems to be to allow scam journals to continue scamming.
So it's an evil software, that should not have been developed, right?
I mean, if you were doing actual peer review, none of this would pass even a half-sentient peer's inspection.
I propose someone actually reads the papers before publishing them.
Springer reveals that they are not interested in fixing the problem revealed by SCIgen, they just want to prevent that software from demonstrating that they have not fixed it. They aren't going to change the review process to ensure that they no longer publish papers which are nonsense. No, they developed software to eliminate those papers which were generated by other software.
The truth is that all men having power ought to be mistrusted. James Madison
arXiv is not peer reviewed. What I found interesting though was the response of the publisher: write a program to detect fake papers. Even the most simplistic peer review - i.e. reading the paper - would immediately catch these papers. If they need to write a program to catch fake papers then their peer review model is essentially worthless and frankly a journal that poor is no better, and likely worse, than arXiv: at least arXiv doesn't pretend to have peer review.
So a program designed to write fake papers to unmask sham journals and conferences gets used to write fake papers to prop up sham degrees? Somewhat ironic; although in fairness to the authors of the paper-writing program, they never intended it to be used in such a manner. It would seem, as Springer acknowledged, that they should do a good peer review, which would eliminate the need to run papers through a hoax detector unless they started getting so many fake papers that their peer review process was overwhelmed. In that case, a first run through a program would be justified. A more subtle point in the article is that claimed publications from some countries, such as China, should be viewed with suspicion.
As a side note, the sham conference industry is interesting. I periodically get, via LinkedIn, invites to attend an "important conference" and speak and get a "prestigious award" based on my "outstanding accomplishments and renowned expertise" in my field. Funny how, when I send them my speaking fee requirements, they never get back to me, nor do they mail me the award as I request if I am unable to make the conference.
I'm a consultant - I convert gibberish into cash-flow.
Publishing houses have 1000's of "peer reviewed" journals to print. They don't have time or actual experts to read them, that is the job of the peers that buy the journal.
I'm subscribed to a number of journals, and it's immediately obvious to me that articles are quite often written in a way that is needlessly obtuse.
I understand as a specialist myself that jargon is a necessary evil, but not to the degree that someone has trouble reading an article that is about a common practice because it's so needlessly dense.
A lot of the fundamentals are simple enough that we should be able to save the complicated language for complicated and novel stuff.
"SCIgen uses a "context-free grammar" to create word salad that looks like reasonable text from a distance"
This is great for students who have lazy professors. Write a good introduction on page 1, a good conclusion on page 52, and use SCIgen on pages 2-51.
an open-source program to automatically detect automatically generated papers.
Just wait till I bust out my Trace Buster Buster.
I wish I had a good sig, but all the good ones are copyrighted
All work and no play make Jack a dull boy. Repeat 10kx
What bothers me is that in the humanities there are whole communities and sub-disciplines in which there is barely any real peer reviewing. These are small niche areas in which everyone knows everyone and basically the whole research is based on invited contributions and papers that are not properly blind peer reviewed - they are cursorily scanned by colleagues who know who wrote the article. In such a field there are about 5-10 journals in total and the authors jump back and forth between them. Most of them are unable to publish articles in top journals of the discipline as a whole. I personally know professors who have built a whole career on the basis of quoting themselves and by doing light editorial work. I know a cross-disciplinary field of study in the humanities that is entirely dominated by two professors, all the rest are scholars of them, and each of them wrote around 40 books, always on the same topic, and all of them more or less repeating the same two pseudo-competing themes over and over.
It's pretty sad to see these people recognized as experts when at the same time in other fields there is hard work and real progress.
From this I read that Springer, instead of promoting measures to ensure real peer review and avoid these scam conferences, actually builds a program that helps these scam conferences. Well done.
At least they have done something to warrant their publication costs. I figured the charges were just all going to the CEO, now we see that some very small part of them went to hire a CSci intern for a few weeks.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
The biggest source of these fake papers appears to be PhD papers. And given that we're producing more PhDs than ever before, maybe we should reform the way we do that. Because in requiring that they actually discover or examine something new the chances are that they're going to lie about something.
If we had fewer PhDs maybe they wouldn't do that so much. But the issue is that there are so many papers that no one can read them. And that means trying to audit this stuff is impractical.
The solution of having robots audit the papers is interesting but for that to be really effective, I think the papers need to be optimized for that sort of scan.
Less prose for example outside of the abstract. More data, more equations, more graphs... more things an expert system could take apart.
Here is my real beef with this idea... I'm pretty sure I could cheat just as easily with this system as I could right now. I think the only thing that would change is that the people reading my work would feel less of a need to check my work and just trust the robot. But if I know how to fool the robot then I automatically win.
Robots are stupid guys. I've never met a machine I couldn't outsmart.
I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
The existence of this tool is admitting that these papers aren't peer reviewed. Wouldn't it be simpler to just admit that and stop committing fraud?
This posting is provided 'AS IS' without warranty of any kind, implied or otherwise.
More corruption in science.
DO NOT PANIC. Nobody's using this corruption to make money.
Except the global warming alarmists, atheist fundamentalists, leftwingwackos, rightwingwackos... basically other corrupt people who then want to run your life.