Study of Massive Preprint Archive Hints At the Geography of Plagiarism
sciencehabit writes with this excerpt from Science Insider: New analyses of the hundreds of thousands of technical manuscripts submitted to arXiv, the repository of digital preprint articles, are offering some intriguing insights into the consequences — and geography — of scientific plagiarism. It appears that copying text from other papers is more common in some nations than others, but the outcome is generally the same for authors who copy extensively: Their papers don't get cited much.
The system attempts to rule out certain kinds of innocent copying: "It's a fairly sophisticated machine learning logistic classifier," says arXiv founder Paul Ginsparg, a physicist at Cornell University. "It has special ways of detecting block quotes, italicized text, text in quotation marks, as well as statements of mathematical theorems, to avoid false positives."
The game is to find a unique angle to approach your research that's essentially clickbait, then produce some results, and figure out some way you can claim victory and go home.
If you're just doing this to get on to the next stage, it makes sense to plagiarize and get it out of the way. You can get to the nice fat yearly income that way without having to know much of anything.
Do we have a quality of scientists problem because science is such an esteemed (and often well-paid, in private practice at least) career that people who should not be scientists are trying to be scientists?
Futurist Traditionalism
Why does anyone need 'credit' for ideas? I think that recognizing a good idea is more important than boiling down credit to a single group or entity.
We need to divest ourselves from the mind virus that is intellectual property. The idea of credit is just another lump on that intellectual property turd.
Anglo-Saxon types might be quick to say, "Oh look what a surprise more plagiarism in 2nd and 3rd worlds!" but there is something very fucking serious going on if five per cent of submissions are involved in plagiarism... and I'm going to conjecture that the "very fucking serious" thing going on is a technocratic one: the plagiarism detection software is inadequate - a technology I have found to be universally shit wherever used - but probably received more testing from people familiar with European languages.
For example, more than 20% (38 of 186) of authors who submitted papers from Bulgaria were flagged, more than eight times the proportion from New Zealand (five of 207). In Japan, about 6% (269 of 4759) of submitting authors were flagged, compared with over 15% (164 out of 1054) from Iran.
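The percentages above can be re-derived from the raw counts in the excerpt. A quick sketch in Python; the dictionary layout and the `flagged_rate` helper are just illustrative names, not anything from the study itself:

```python
# Counts quoted in the excerpt above: (flagged authors, submitting authors).
counts = {
    "Bulgaria": (38, 186),
    "New Zealand": (5, 207),
    "Japan": (269, 4759),
    "Iran": (164, 1054),
}


def flagged_rate(flagged, total):
    """Fraction of submitting authors who were flagged (hypothetical helper)."""
    return flagged / total


rates = {country: flagged_rate(*pair) for country, pair in counts.items()}

# Print each country's rate as a percentage.
for country, rate in rates.items():
    print(f"{country}: {rate:.1%}")

# How many times higher is Bulgaria's rate than New Zealand's?
ratio = rates["Bulgaria"] / rates["New Zealand"]
print(f"Bulgaria / New Zealand: {ratio:.1f}x")
```

This only checks the arithmetic in the quote; it says nothing about how the authors were flagged in the first place.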
I suspect that the ratio in countries where the motivation could -literally- be publish or perish, will be consistently higher than those where the saying is figurative.
~ Whence do you come, slayer of men, or where are you going, conqueror of space?
I wonder how many of the people flagged as plagiarizing in countries with a low rate are originally from countries with higher rates.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
it was printed in newsweek 17 years ago, verbatim.
And, I have found that copying text from other papers is more common in some nations than others, but the outcome is generally the same for authors who copy extensively: Their papers don't get cited much.
about one in 16 arXiv authors were found to have copied long phrases and sentences from their own previously published work
OK, sometimes quoting your own work may be legit, but this sounds more like simple boilerplate cut and paste
"copying text from other papers is more common in some nations than others"
I presume they mean "plagiarism is more common in some nations than others", but hey, they are talking to Americans here...
Let me guess which nations had the most dishonest people in them. India, perchance? China? We can leave Africa out (I know it's a continent), they don't 'do' tests there...LOL.
Joe Biden, serial plagiarist.
the most massively plagiarized paper in the history of the universe.
I don't have their whole data, but they did put up 10 or so on their nice little map. Seems more like the fewer papers a country has the higher the percentage of plagiarism. However, the US has so many papers in this study it should be divided into smaller regions.
I wonder how much these disparities are due to western researchers knowing how to game the system. Some 10 years ago I received a warning related to "self-plagiarism" because I had copied the definition of a problem from one of my previous papers (one column, the rest of the paper was completely new). Since then, I know I have to change the text of the problem definition between two papers, even if it is the same. In the meantime, I have seen people submit the same work to two different conferences after changing just the wording of the papers (or the presentation), and not being charged with plagiarism (especially if they are well-established in the field). Actual plagiarism I have only seen in one paper with Chinese authors. So, presuming most plagiarism is in fact self-plagiarism I wonder how pertinent the results are.
Eurocentric, every last one of 'em
If you're going to plagiarize, don't upload your paper to arXiv.
"First they came for the slanderers and i said nothing."
I'm too lazy to do the math but I bet there's a correlation w/Transparency International's Corruption Index. Causation is also an exercise left for the reader but I'd guess it's a cultural thing.
I could not understand if the political map represents the past or the future, as a bunch of countries are shown to be a part of Russia.
Some countries place a high premium on memorizing and repeating back the teacher's words. These countries still produce their share of good and bad engineers, but they're sometimes bad in unrecognizable ways.
I once hired a software engineer from a third world country who had an encyclopedic knowledge of design patterns. You could name any pattern in the GoF *Design Patterns* book and he could reel off the UML without hesitation and give a convincing sounding explanation of how the pattern worked. But when I started inspecting his code, I quickly realized he had no understanding of what any of it meant. It was just pictures and words he'd memorized, an impressive and prodigious feat, but ultimately useless to me.
Now I should say I've hired some very good software engineers from this country; it's not that they don't make good engineers over there. For most people the discipline to absorb a lot of information yields many benefits. But this guy was an outlier; he managed to get a master's degree over there in a subject he had no practical understanding of whatsoever.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Breakdown by region doesn't mean anything. IMHE, India should be a top offender. Maybe the fails in the US are all from native Indians and Bulgarians....we don't know.
I saw this same exact post over on reddit yesterday, but it was posted by a different user ...
This..
You either write a paper which can be understood on its own (describing the instrument and spacecraft in summary) or you have to reference some other paper, in some other journal, which may not be as readily available. A lot of the really good "here's how the spacecraft works, its mission plan, all the instruments" papers get presented at conferences by the engineers who developed the equipment and ran the mission. And conference papers are a lot harder to come by than journal papers for a variety of reasons.
Behind all this is that the employers and evaluators of those Engineers are not as driven by "peer-reviewed journal pubs" as they are by "delivered on time and within budget" and they're really not wild about "I need to be kept on the charge number for 2 years while I shepherd the paper through the review process at the journal". Make no mistake, in modern business, there is no "spare time on nights and weekends" to get your papers published. Nope, your night and weekend time is going to be spent trying to deliver on time and within budget for the next spacecraft. The manager of the engineer delivering that astronomy instrument doesn't care a rodent's fuzzy behind whether a paper is ever published.
So, if you're a scientist publishing that paper about your new findings, you'd better put a canned description of the instrument and mission in your paper.
To quote the article "It shows only the incidence of flagged authors for the 57 nations with at least 100 submitted papers, to minimize distortion from small sample sizes." If a country has a total number of papers in the hundreds it implies the number of authors is also low. Therefore, a small number of authors who routinely plagiarize can have a major effect.
It's analogous to a small town with a very low crime rate. All it takes is a few significant incidents to cause a huge jump in the statistics.
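One rough way to put numbers on the small-sample point is to attach an uncertainty range to each flagged rate. The sketch below uses a Wilson score interval, which is my own choice of method and not something the study is known to have used, with the Bulgaria (38 of 186) and Japan (269 of 4759) author counts quoted from the article elsewhere in this thread. The `wilson_interval` name and the 1.96 z-value (roughly 95% confidence) are assumptions for illustration:

```python
import math


def wilson_interval(flagged, total, z=1.96):
    """Wilson score interval for a proportion (illustrative helper, z ~ 95%)."""
    p = flagged / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Counts quoted from the article: a small country and a large one.
small = wilson_interval(38, 186)     # Bulgaria
large = wilson_interval(269, 4759)   # Japan

for name, (lo, hi) in (("Bulgaria", small), ("Japan", large)):
    print(f"{name}: {lo:.1%} to {hi:.1%} (width {hi - lo:.1%})")
```

The interval for the small country comes out several times wider than the one for the large country, which is the small-town effect in numerical form: a handful of authors can move the headline rate a lot.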
For comparison, it would be interesting to see the rates for other kinds of text reuse. From the article:
For comparison it would be useful to see the percentage of this reuse displayed on another map. I have a strong suspicion that countries that look good on the presented map would not look nearly as good by this measure.
Why is Snark Required?