Scientists Don't Read the Papers They Cite
WatertonMan writes "Very interesting and sure to be controversial study that suggests most scientists don't read the papers they cite. This means that if one paper misreads a work the misreading propagates. It's a very interesting study and has big implications for science, in my opinion. New Scientist has a good overview of the work. Given that most attention to work has been in sloppy work on the experimental side (poor methodology or outright fraud) this suggests a whole other problem. A lot of the ultimate problem is that many in research are concerned more about publishing than in solving the issues they investigate. Ideally the point both in science and in academics in general is to understand the ideas. Yet those of you who've looked up footnotes realize that actually engaging the ideas of other researchers typically falls by the wayside. Often footnotes are there simply because references are needed. Engaging others' works is secondary. I've always thought that the hard sciences were more immune to that effect than the humanities. I guess not."
I wouldn't either -- those things are boring! ;-)
Most of my classmates don't read the papers they write. Do we hold others to a higher standard?
You can't judge a book by the way it wears its hair.
...where no one reads the articles they cite. We are in good company!
I want to drag this out as long as possible. Bring me my protractor.
We should crush all those who make foolish mistakes, just like that guy Karl Marx says in his "Communist Manifesto" (Marx, 65)
I've also seen the case where scientists will constantly refer to their own, or their colleagues', papers. This is an easy way to increase the "cited" count of the referred paper, making one's work look more useful, even when the citation has little or no relevance to the current topic.
The study seemed to be checking for typos in citations. Just because a scientist has copied the text of a (wrongly typed) citation does not mean s/he has not read the paper. There is no law that says someone writing a paper has to type up every citation they make from scratch.
I'm almost tempted to say that this is a side-effect of all those teachers who said 'I want at least 10 references and a 5 page paper'. At least, I can't think of any serious reason why, even if someone was just publishing fluff, they'd need to bulk up the references with irrelevant ones. The only other thing I can immediately think of is that a reference becomes somewhat standard, so they use it for something they learned and forgot where they learned it from (you can't exactly say [11], 11. Professor Ragan's Astrophysics 521 class or [12], 12. Two dozen vaguely remembered textbooks). Even then, I suppose it's bad form not to find some reference with the relevant information just to prove you're not making it up (yes, pi IS 3.1415....).
Another major problem with research papers is the "disappearance" of sources, even for those who actually do cite them properly.
As many of you know, the Internet is a great research tool these days. But unfortunately, it's too dynamic for the research world. "Most URL references [stand] more than a 50 percent chance of not existing after only six months." (from a Cornell study at http://www.news.cornell.edu/chronicle/00/12.14.00/web_citations.html)
I don't care as much if some researcher only reads parts and pieces of papers that they cite, but when the entire papers disappear, that's a much bigger problem.
"The study, using term papers between 1996 and 1999, found that after four years the URL reference cited in a term paper stood an 80 percent chance of no longer existing."
Is Slashdot written to the maxim "no news is new news"?
Charles Darwin is known to have cited other people's work that he hadn't read (I forget the name of the author involved - not being in the field myself). Then there was the entire field of molecular biology in the 1990s, which suffered more scandals than a dyslexic shoe factory.
Slightly more relevant (though still stretching back decades) is that some authors don't read the papers they co-author - look at all the people who co-authored papers with Jan Schoen, the team who, with Ninov, "discovered" Ununoctium, etc.
Next you'll be telling us that (shock! horror!) some scientists pass off other people's work as their own, with a fascinating NEW revelation about Rosalind Franklin's work in the discovery of the DNA structure.
Anyone who sees how some articles are written knows perfectly well that the "bibliography" is usually created as a "necessary evil". Most scientific articles are basically written to fit several "obligatory templates": abstract, main article, citations, bibliography and notes. Frequently, the real authors are not the ones you see first in the header of the article but someone at the end of it. Also, sometimes, certain people commit the most flagrant plagiarism of the work of their students or co-workers.
What I call "academic science" is full of huge problems, which sometimes reach the level of flagrant falsification and demagogic manipulation of facts. While not a scientist per se, I have seen how these things pass the limits of ethics and morality in a field like Mars research. There is one scientist who tragically died in a very strange situation. Apart from the conditions of the tragedy, there was one big "authority" on Mars who lied through his teeth about the work of his deceased colleague. Frankly, it was shocking to see how this guy flagrantly and demagogically "reinterpreted" the intentions of his colleague's scientific work. One should note that both men were highly regarded in the community. However, they were adversaries. One died; the other became a big scientific authority on Mars. One of the reasons was that he did a lot to dismiss the works that went against his theories.
This paper takes some very simple statistical models and turns them into what seem to be totally unfounded generalizations about the way science is done. Taking their statistical conclusions at face value, we find that 77% of the people who cited the paper didn't read it in its original form. But, they go on to conclude that a) the only source of information about the paper could have come from a single other paper (namely, the paper with the original citation), and b) misunderstandings about the conclusions drawn by a paper will spread "like wildfire." They do not actually demonstrate this latter conclusion, and don't show that any of the papers actually did misconstrue the science in the original paper.
This is because heavily cited papers become very widely known and understood. Not everybody who's ever cited "On the Origin of Species" has read the whole thing, but it certainly does not follow that they took their understanding of its conclusions from a single other citing paper.
They end their article with a smug admonition to "read before you cite." These guys sound like the guy with a clean desk who never gets anything done complaining about all the clutter on your desk. Smug social scientists criticizing physicists for their lack of citation rigor does not impress me. There are plenty of better reasons to criticize physicists this year (e.g., Ninov and Schoen). This one seems a bit silly.
* mild mannered physics grad student by day *
* daring code hacker by night *
http://www.silent-tristero.com
Well, it may be a troll, but the fact remains that some very prominent scientists claim that the HIV -> AIDS theory wasn't proven before it was adopted.
Their claim is that this exact thing happened in the early 80's, and that instead of actually reading the research that said that HIV may cause AIDS (which was inconclusive) they simply took the ball and ran with it, causing years of research to be based on the same incorrectly cited source.
Who knows what the answer is, but it's a fascinating subject to read up on.
This doesn't come as news for me..
As a student starting my PhD studies, I once asked a researcher at the department about a paper. He told me he hadn't read it.
The next day, I saw that he had indeed quoted that paper in one of his.
However, it usually isn't such a big problem when papers are cited without being read, since it usually only happens with papers peripheral to the subject (for example, to justify a certain method or procedure that is common practice).
Also, sometimes the relevant portion of a paper can be summed up in one sentence, or in the abstract.
Copying a reference string doesn't mean that you haven't read the paper in question. To take a personal example of what I've done:
1. Find a reference to a paper which looks interesting.
2. Walk down to the library, remembering that you're looking for Bob's paper about bars in the Journal of Foo.
3. Arrive in the library, find the paper, read it, decide it is important.
4. Walk back to computer, copy out reference string.
It's quite easy to look up a paper from a slightly-wrong reference, and as long as the reference is close to correct, it's fairly easy to not realize that the reference was wrong in the first place.
Tarsnap: Online backups for the truly paranoid
Slashdot readers don't even read the articles they cite... What's this world coming to?
You see? You see? Your stupid minds! Stupid! Stupid!
Anyway, this comic seems appropriate.
Gee ... most scientists use a program (like Endnote) to format bibliographies, using data downloaded from a database (like PubMed). I suspect that this is more a deficiency in proofreading reference lists and assuming that databases are correct, rather than a lack of reading the original material. Whether people read articles carefully is another matter, of course.
In fact, a blatant miscitation of a given reference would often get caught during the peer review process. This happened to me once when I rewrote part of a paper and forgot to remove one of the references that no longer applied ...
To support the view that observations got better and better, requiring more and more circles, you'll probably find most of these sources citing a book by J.L.E. Dreyer, written in the beginning of the previous century, but it exists in a few editions published later.
But Dreyer says the opposite:
Basically, if these people had actually read Dreyer, we wouldn't have had to struggle with this myth any longer. Of course, there's a lot more to this story than this, but I don't have time to write it now... :-)
Employee of Inrupt, Project Release Manager and Community Manager for Solid
Doesn't this point to a failure of the peer review process? Aren't the reviewers bothering to check whether the references are relevant, and for the ones that are, whether the paper actually interprets and builds on the prior work in a reasonable manner?
As far as twisting up evidence, yes, this does happen. But most definitely not 100% of the time. How was the solar neutrino problem ever discovered in the first place? How was a re-evaluation of the cosmological constant initiated? These (and many other ideas) were brought forth not because someone wanted their ideas to be put forth, but because their hypotheses did not match the experimental data! It most definitely is not bullshit. AFAIAC, science is still the most altruistic of professions, not to mention one of the most self-sacrificing.
I am probably one of the few people out there who has ever leafed through academic journals for fun. Still, those things are incredibly boring.
The issue here is that people expect articles to have a certain shape, form, and style, including a literature review. And a lit review can be a pain. You don't want to read an article more than is required to get the basic gist of its relevance to your work. Sometimes, that can be done by reading just the abstract.
The suggested rate of non-reading articles is also possibly overstated. That one has mis-cited a work does not necessarily mean that one has not read it. I can, for example, read an article ten years ago and remember the basic meaning I need to take out of it, and include it in my own references upon seeing it in the references of another's work without refreshing my knowledge of the work. Or I could just use another work's references as a reading checklist and not bother to correct it (or be unaware of the mistake if I sent a poor grad student or some other lackey to the library to copy the journal for me).
I assume the full article by Simkin and Roychowdhury probably states the likely sources of commonly copied errors. I'm a tad curious to see whether the authors of those progenitor articles propagated their own mistakes in future articles or if they corrected them.
While the article claims that "a billion different versions of erroneous reference are possible," in practice that may not be as true. With the errors being volume, page, or year, the most likely errors are transposition of two digits, deletion of a digit, insertion of a digit, or replacement of a digit. In the latter two, the error will most likely be the use of a neighboring number on the keyboard. A one is much less likely to be replaced by a nine than by a two. That is unlikely to lower the probable number of copied citations to below 50%, but it is still a possible source of error that may or may not be accounted for.
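The error types listed above can be sketched in code. This is only a minimal illustration, not anything taken from the Simkin and Roychowdhury study: the names `classify_typo` and `adjacent_keys` are hypothetical, and "neighboring number on the keyboard" is assumed here to mean adjacency on the top number row (a numeric keypad would give different neighbors). It takes the correct and the mistyped volume, page, or year as digit strings and names the single edit that separates them.

```python
# Order of digits on the top row of a QWERTY keyboard. This is an assumption;
# a numeric keypad would give different neighbors.
NUMBER_ROW = "1234567890"


def adjacent_keys(a: str, b: str) -> bool:
    """True if digits a and b sit next to each other on the number row."""
    return abs(NUMBER_ROW.index(a) - NUMBER_ROW.index(b)) == 1


def classify_typo(correct: str, typed: str) -> str:
    """Name the single-edit error that turns `correct` into `typed`.

    Both arguments are expected to be strings of digits.
    """
    if correct == typed:
        return "none"
    if len(typed) == len(correct):
        # Positions where the two strings disagree.
        diffs = [i for i, (c, t) in enumerate(zip(correct, typed)) if c != t]
        if len(diffs) == 1:
            i = diffs[0]
            if adjacent_keys(correct[i], typed[i]):
                return "replacement (neighboring key)"
            return "replacement"
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and correct[diffs[0]] == typed[diffs[1]]
                and correct[diffs[1]] == typed[diffs[0]]):
            return "transposition"
    if len(typed) == len(correct) - 1:
        # Deletion: dropping one digit of `correct` yields `typed`.
        if any(correct[:i] + correct[i + 1:] == typed
               for i in range(len(correct))):
            return "deletion"
    if len(typed) == len(correct) + 1:
        # Insertion: dropping one digit of `typed` yields `correct`.
        if any(typed[:i] + typed[i + 1:] == correct
               for i in range(len(typed))):
            return "insertion"
    return "other"
```

For example, `classify_typo("1973", "1937")` reports a transposition, while `classify_typo("1973", "9973")` is a plain replacement, since 1 and 9 are far apart on the number row. Anything needing more than one edit comes back as "other", which is roughly where the "billion different versions" would live.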
I wanted to take up the point in the article that many researchers are more interested in publishing than in solving the issues they investigate. I'm going to preface this by stating that I'm a psych. major and, as such, do not have much knowledge of the specifics of other fields, but I assume their requirements are similar.
In university settings, it is all about how many papers you have published. When a professor is first accepted to the faculty of a university, he/she must "publish or perish" for the first 5(+[?]) years. If you do not publish often enough in those first years, you are not retained. Things get better after you get tenure; you are not required to publish as often. So, it should not come as too great a surprise if people are more interested in publishing than solving the issues.
I personally think the requirements of universities should change so that we are not searching through a glut of papers, all saying many of the same things (or close enough). I am more concerned with the falsification of data, which totally throws everything off, than with a tendency to publish papers that don't necessarily solve the issues, which makes finding relevant research difficult but shouldn't substantially hurt the future of the field.
"I swear I won't break you if you let me take you where the willows never weep" -- Switchblade Symphony
Um, OK. I'll try it:
/. headline
1. Read
2. Form angry, uninformed opinion.
3. Post
4. ????
5. Karma!
Doing science for the money is like having sex for the exercise. There are many other ways to make considerably more money that require far less work. The raison d'etre of science is the joy of discovery; no one spends 6-8 years in higher education getting a PhD just for the paycheck. People do it because they love it.
As far as scientists faking results, yes, it happens. However, the beauty of the scientific method is that it is self-policing. Anyone can read the journals; anyone can write the editors of said journals and report anything that's not above board. As for papers not being read in the first place, well, let's hop on the Magic School Bus and take a quick tour of the scientific publishing process.
First, write the paper. Then, submit it to either a journal or a conference. In either case, the pool of available papers will be divided over the number of people on the review board of the respective journal/conference, so a bunch of people read a few papers. Once here, the aforementioned paper is either rejected or accepted. If accepted, it is published.
After the paper is published, other scientists read the paper. If it is useful for their work, they may incorporate some of the ideas into their own work, at which point, they'll test the idea that they're borrowing to see if it makes sense.
If it does make sense, they'll use it. If not, they'll tell the whole world, discrediting the work and embarrassing the original author. Thus there is plenty of pressure to do good science. The people doing legitimate work far outnumber the charlatans just submitting gibberish.
Matt
Sometimes all someone wants is a certain result from a paper. Reading and understanding the full reasoning behind a result rather than the result itself may mean the difference between an afternoon of work and 3 weeks of work. Multiply that by the number of citations a paper has, and a hapless but well-meaning scientist would spend all their time digesting their citations rather than publishing papers and would soon be relieved of their position.
Understanding the details behind cited results is certainly very important, but in the real world there are real tradeoffs that researchers constantly have to evaluate professionally regarding how much time they spend understanding and in how much detail they understand any given result.
This posting is interesting, certainly, but it is not news.
-- My choice of computing platform is a symbol of my individuality and belief in personal freedom.
...most scientists don't read the papers they cite. This means that if one paper misreads a work the misreading propagates.
If they're not reading the papers, why would it propagate?
I've grown tired of hearing members of the so-called 'medical' profession lecture me on how 'risky' my 'high-protein' diet is (seems most doctors are functionally deaf and/or immune to learning anything at all from a non-doctor). I gotta wonder how much more 'risky' my MODERATE protein is than being more than 100 lbs overweight. Seems doctors only read the conclusions of studies, and not the actual studies. I have come to the conclusion (based on my personal experience, and comparing notes with several dozen others in the same situation) that the typical 'research' paper follows these steps:
1: Write down a conclusion
2: Write a paper supporting that conclusion
3: Do some 'research', carefully structured to support that conclusion
4: Discount or discard any data that doesn't support that conclusion
5: Get the paper reviewed by a group of associates that agree with your conclusion
6: Publish the paper in some mutual-admiration society journal
My favorite along these lines is one entitled "Type 2 Diabetics Benefit From Reducing Intake Of Animal Protein". If you read the summary very carefully, you will see that the 'researchers' removed the SUGAR from the diet, and then concluded, from the resulting health improvements, that animal protein causes type II diabetes. (!!) This is, unfortunately, typical of what passes for 'science' in the study of diet.
Concealed Handgun License Courses in Plano, Texas
Take a look at the ACM or IEEE and the number of journals they support, then toss in folks like Springer Verlag. Figure out how many articles are published in these each year. Just from counting you might determine that many of these are pretty meaningless. Try reading a few at random and see if you change your mind.
Now remember that the folks on a tenure/promotion committee know nothing about what a researcher might do - they're even more ignorant of the research field of someone else than they are of their own. So, how do they determine how good a researcher might be? They're sure as hell not going to wade through yet another meaningless paper. It's simple. They count. How many publications? How many grants? How many citations from other papers to the researcher's papers?
And it's an interesting feedback loop: even getting a publication or grant can depend on your publication and grant history. And if you suspect that someone might be reviewing your paper/proposal who works in the same area, you might want to make sure there are a couple of citations (always positive, naturally) of that person's work included.
So, we know someone wants publications/grants/citations and they need p./g./c. to get p./g./c.. They do some research, it depends heavily on two or three other bits of research. But two or three citations aren't enough. So they might want to use the citations they find in the work they cite. OK. This citation looks good perhaps, but the original article isn't available in the local library and inter-library-loan will take a month to get it and the deadline is next week. Oh well. Cite away - the original author isn't likely to complain (after all this is another citation to his/her work).
And so it goes.
As someone who has written a number of scientific papers (and yes, sometimes, but not often, cited articles that I haven't read), I think there are a couple of reasons contributing to the problem:
1) Cost of journals -- often there is an article that ought to be cited in your work (because it was published before yours, and is related), but is in a journal unavailable at your university's library. There are thousands of journals, and their high costs (often thousands of dollars a year each) mean that no library can have them all. But why not simply ignore an article you haven't read? Read on.
2) Pride of Reviewers -- When a scientific article is sent to a journal, it is passed on to several researchers who are doing similar work for peer review. While it would be nice to think that reviewers are not so petty, the fact is, if you haven't cited their work, they might get angry and reject the paper. So, authors feel that it is better safe than sorry and cite freely.
The real problem here is inherent in the academic system. Research faculty are in a situation where they are being judged by the number of papers they put out, and not on the quality or the potential of their work. This leads to unscrupulous individuals doing "whatever it takes" to get ahead.
What needs to be done is to reform the way merit is assigned in academia. Research funding and tenure need to be allocated based not only on the quantity of publications but on other factors which may be harder to measure, factors that would be better indicators of the value of their research.
A somewhat related issue is that more and more private sector funding is flowing into universities and along with that funding comes the expectation of a quick return on investment. This creates more pressure to pursue short-term goals with little long-term impact on the field of study.
Taken together, US scientific research is destined to fall behind and stop making new breakthroughs. Seemingly, the only apparent solution to this is to increase the amount of public funding available for basic research. It would seem, though, this is not likely to happen given the current regime in Washington. A more likely outcome will be that our scientific institutions will all be doing R&D for the big corporations in the near future.
A lot of the ultimate problem is that many in research are concerned more about publishing than in solving the issues they investigate.
The problem is that the higher-ups in the university system essentially mandate a certain number of peer reviewed publications for promotions, hell even to keep your job if you're not tenured. This, I feel, is part of the problem in that we're pushed so hard to get X number of publications per year. In a sense it's necessary to weed out the schmucks (anyone can get a Ph.D. nowadays), but it also can cause the quality of the research to decline. The whole quality vs. quantity argument.
Just my $0.02.
I think people forget that the Hard Sciences are made up of people, same as the social sciences, and also have the usual problems associated with using people to try to get stuff done. (Although I'm not sure I'd put not reading all of the papers you cite real high on the list - if all you're after is one point in a long and complex paper that seems like a fairly inefficient use of time. Some of these papers are HARD to understand.)
What gives the Hard Sciences the right to that title is that, eventually, someone will root out the bull that someone else has published, brand it as such, other people will check it and agree, and it dies. You can prove someone WRONG. Try that in the social sciences - has anyone ever heard of a huge scandal where someone faked results in the social sciences? They would get in trouble if they didn't do the studies and were found out, but can you prove that they cheated just by taking their conclusions, working with them, and crying foul when something doesn't work? In the Hard Sciences, you can. That's what makes them so strong and practical.
Not that Social Sciences are worthless, mind you. It's just that BS seems to be a lot easier to get away with there. Sort of like in English class, when we were supposed to get the meaning out of a book. I never get the meaning the author's trying to convey (or at least what they say later he/she was trying to convey), but I wrote down something and got a good grade. Because how could they prove my thinking about the book wrong? I think the social sciences have a little of that problem in them somewhere. Controlled experiments are really tough to do, so you run into problems.
"I object to doing things that computers can do." -- Olin Shivers, lispers.org
I'm a PhD student in Literature (I know...) and although there's definitely a bit of a problem in the Humanities with people not responding to others in a useful dialogue at times, and there is certainly the same "publish or perish" imperative, it is really a *huge* faux pas to not have read the entirety of the paper/book you cite. In my field, you can easily be discredited for your entire academic career for that sort of thing.
Incidentally, it seems to me that the peer review process that exists in both the humanities and the sciences ought to catch these people who are completely misreading their source material. If neither the people writing the papers nor the reviewers are familiar with secondary materials, a real problem exists.
"I do not fear computers. I fear the lack of them." -Isaac Asimov
The logic behind the article is very, very weak. The basis of the article is that misquotes in citations (wrong volume, page number etc.) propagate from one paper to another, which shows that the authors cut-and-pasted citations from earlier papers. Sure. But the researchers quoted claim that this means that the authors didn't read the papers concerned. Rubbish.
During the research stage of a project, you read the papers. Error in the citation - no sweat; you know the authors and title, and a search engine will give it to you in nothing flat.
Weeks or months later, it is writeup time. Open the first paper to cite it. And there are all the other references you followed (a little trouble in the lookup is long forgotten) and dutifully read. And - get this - it is easier to cut-and-paste the citation than to go back to the paper and assemble - separately - the publication, title, authors and page numbers.
The only thing the research quoted proves is that papers are overwhelmingly circulated electronically and the dead tree format is, for scientific papers, obsolete.
Consciousness is an illusion caused by an excess of self consciousness.
Look, as someone who's written scientific papers, the claims in the article are not only false, but indicative of poor science themselves. They're making the classic experimental stats mistake. Namely, copying and pasting citations from other sources is *absolutely uncorrelated* with whether those papers have been read by the author.
Formatting citations is fussy, tedious, and annoying. You have to look up the page numbers in the journal (which you may not even have in these days of online papers), figure out who the publisher was, the issue or journal number.
I read every single one of the papers I've ever cited. But it was rare that I ever typed in a citation from scratch. Usually you get them either from an on-line citation database, from the bibtex entry helpfully supplied on the cited author's web page (scientists like being cited!) or, yes, by typing out a citation from a printed paper.
In any given field, usually some kind-hearted soul starts collecting a database of citations for others to use. For instance, here's one here:
http://www.helios32.com/resources.htm#Bibliograph
Have a look; you'll soon twig to why people don't type these in from scratch.
Creating the citation all over from scratch when it's right there in front of you is about as pointless as adding a link to a web page by retyping some monstrous 200-character URL. Just because you copied and pasted a link doesn't mean you didn't read the article, does it? (I guess slashdot is the wrong place for that particular piece of rhetoric.)
I'm disappointed in New Scientist. The pissy little diatribe about science in the story submission is par for the course. Please, leave the pontificating to people who have a clue.
In fact, how about a retraction? (Ha ha ha ha!)
A.
Yes, that is the tinfoil hat explanation.
Now try this one: authors are human beings who make typos. They cut and paste erroneous references because they don't want to waste time retyping the reference. They read articles from the online versions of journals, and sometimes the citation info provided online is incorrect or altogether absent.
One thing that does disgust me is the explosion in the number of footnotes associated with a typical academic paper these days. I recently submitted a paper with a not-particularly-important result to a not-very-important journal, and the paper had forty-one footnotes. (Most were added by my coauthor.) If you visit a mature university library, pull out a copy of an older periodical. Copies of Philosophical Transactions from the nineteenth century are a delight to read. I read a paper by Kelvin from (IIRC) 1807, and it had seven references. Seven!
The growth of massive, searchable databases of papers (eg Medline) has led to many more footnotes per paper, and many more potential typos. For the record, the paper I mentioned above contained at least three errors in the footnotes that were noted and corrected by the journal publisher. Perhaps New Scientist should be writing a scathing expose on the decline of proofreading and rise of profligate namedropping in footnotes.
~Idarubicin
This wasn't a hick lawyer either.. She was senior partner in one of the largest law firms in BC, had a reputation for never losing a case, and became a judge a year or so later (Judgeship is more of a peer-review process in Canada than it appears to be in the US).
This left me with a feeling that lawyers don't pay as much attention to their authorities as they could. Probably more so than scientists do with their citations.
OS Software is like love: The best way to make it grow is to give it away.