Slashdot Mirror


Tearing Down Science's Citation Paywall, One Link at a Time (wired.com)

Citations play an incredibly important role in academia. To scientists, citations are currency. Citations establish credibility and determine the impact of a given paper, researcher, and institution. However, the citation system is crippled by a problem. Over the last few decades, only researchers with subscriptions to two proprietary databases, Web of Science and Scopus, have been able to track citation records and measure the influence of a given article or scientific idea. This isn't just a problem for scientists trying to get their resumes noticed; a citation trail tells the general public how it knows what it knows, each link a breadcrumb back to a foundational idea about how the world works, according to an article in Wired. The article adds: On Thursday, a coalition of open data advocates, universities, and 29 journal publishers announced the Initiative for Open Citations with a commitment to make citation data easily available to anyone at no cost. "This is the first time we have something at this scale open to the public with no copyright restrictions," says Dario Taraborelli, head of research at the Wikimedia Foundation, a founding member of the initiative. "Our long-term vision is to create a clearinghouse of data that can be used by anyone, not just scientists, and not just institutions that can afford licenses." Here's how it works: When a researcher publishes a paper, the journal registers it with Crossref, a nonprofit you can think of as a database linking millions of articles. The journal also bundles those links with unique identifying metadata like author, title, page number of print edition, and who funded the research. All of the major publishers started doing this when Crossref launched in 2000. But most of them held the reference data -- the information detailing who cited whom and where -- under strict copyright restrictions.
Accessing it meant paying tens of thousands of dollars in subscription fees to the companies that own Web of Science or Scopus. Historically, just 1 percent of publications using Crossref made references freely available. Six months after the Initiative for Open Citations started convincing publishers to open up their licensing agreements, that figure is approaching 40 percent, with around 14 million citation links already indexed and ready for anyone to use. The group hopes to maintain a similar trajectory through the year.
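
To make the "who cited whom" data described above concrete, here is a minimal sketch of turning one registered record into citation links. It assumes a record shaped like the JSON that Crossref's public REST API returns for a work, with `DOI` and `reference` fields; the sample record, its DOIs, and the helper names `citation_links` and `has_open_references` are made up for illustration and are not taken from the article.

```python
import json

# Hypothetical sample shaped like a Crossref "work" record.
# The DOIs and title are invented; field names are an assumption
# based on Crossref's public REST API.
SAMPLE_RECORD = json.loads("""
{
  "DOI": "10.5555/example.paper",
  "title": ["An Example Paper"],
  "reference": [
    {"key": "ref1", "DOI": "10.5555/cited.one"},
    {"key": "ref2", "unstructured": "Some book without a DOI, 1999."},
    {"key": "ref3", "DOI": "10.5555/cited.two"}
  ]
}
""")


def citation_links(record):
    """Return (citing DOI, cited DOI) pairs for references that carry a DOI."""
    citing = record["DOI"]
    return [
        (citing, ref["DOI"])
        for ref in record.get("reference", [])
        if "DOI" in ref
    ]


def has_open_references(record):
    """True if the record exposes a non-empty reference list."""
    return bool(record.get("reference"))


if __name__ == "__main__":
    for citing, cited in citation_links(SAMPLE_RECORD):
        print(f"{citing} cites {cited}")
```

A record whose publisher keeps its references closed would simply lack the `reference` list, so `has_open_references` would return False and no links could be extracted from it.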

10 of 50 comments

  1. I feel like I'm missing something. by Gravis Zero · · Score: 2

    ...convincing publishers to open up their licensing agreements, that figure is approaching 40 percent...

    “It’s not that much actual work to do it, it’s just about flipping a switch and getting publishers to agree to releasing this data,”

    But when the publishers see what the Initiative for Open Citations is doing, won't they just terminate the licensing agreements because the Initiative is potentially cutting into their profits?

    --
    Anons need not reply. Questions end with a question mark.
  2. How about Google Scholar? by Yergle143 · · Score: 2
    1. Re:How about Google Scholar? by geek42 · · Score: 2

      Google Scholar counts citations too and it is free.

      Totally agree: Google Scholar, and also ResearchGate. From the article: "This is the first time we have something at this scale open to the public with no copyright restrictions". Hm... Sceptical.

  3. Publishers are the problem by sinij · · Score: 2

    Most scientific journals are published by for-profit organizations that in turn lock down the submissions they publish with copyright. These publishers don't provide grant money to do research, and they don't pay peer reviewers, who are volunteers; they may pay something to the editor. This setup made sense back in the era of dead trees and snail mail. It makes no sense today with the Internet.

    Once we fix this problem, we can start fixing other problems, such as reproducibility. If you don't have inbred editors, it will be possible to publish confirmation or refutation of findings instead of only "novel" research. If you don't have a paywall, non-academics will be able to access this mostly government-funded research and actually flag bogus or wrong studies. If you don't have unaccountable editors deferring to the list of approved peer editors, then you will have critical questioning of the work instead of groupthink. Anyways, we should also always publish the names of peer reviewers; they too should be held accountable for published work.

  4. What's taking so long? by Solandri · · Score: 2

    It's only been 28 years since Tim Berners-Lee proposed a method of information storage and retrieval for exactly this purpose. His work was done in the wake of the Fleischmann-Pons Cold Fusion announcement in 1989, which saw scientists sending faxes of faxes of faxes of the draft journal paper to each other so they could try to replicate their experiment. He figured there had to be a better way. His proposal grew into the World Wide Web, as seemingly everyone adopted and embraced it except scientists publishing papers - the very people Berners-Lee had in mind when he created it. In the intervening 28 years, we've even seen a new company whose sole purpose is to provide people with real-time spot-rankings of citation links created under that proposal, grow into one of the most powerful in the world - Google.

  5. Re:Sad by godrik · · Score: 2

    Well, I have to disagree. People look at citations to get a quick and dirty idea of how popular a paper or a researcher is.
    But that is only really used as an initial filter. Most universities look at impact. Citations can be used as a proxy for impact, but really impact is what people are looking at.
    Citation patterns are quite important for understanding the structure of a field and for mining the work automatically. So I am quite happy to see an effort to make these data more public.

  6. Re:Sad by 110010001000 · · Score: 2

    How does code review convince anyone the code works? It just means someone else looked at it, but didn't try to run it.

  7. Re:Ironically, the article linked is behind an adw by ncc74656 · · Score: 2

    The article linked by this story blocks its contents unless you turn off ad-blockers or agree to pay a fee.

    archive.is gets past many adwalls, including whatever Wired is using. GGBlocker automatically redirects Wired links (among others) to archive.is for me whenever they pop up...you can view the archived article here ad-free, whether you have an ad blocker active or not.

    --
    20 January 2017: the End of an Error.
  8. Re:Sad by Altrag · · Score: 2

    "Real" science:

    a) Fits all currently available data, within a margin of error (and the margin of error needs to be specified.)

    b) Is falsifiable. That is, it must always accept the possibility that some new data will come in tomorrow that breaks the theory. That is why religion can never be science -- "God did it" is always an acceptable answer no matter what happens and therefore the "theory" is not falsifiable. That doesn't imply that there is no God or anything that extreme -- just that God's existence (or non-) is something that can't be simply talked about scientifically at all.

    c) Makes inferences/predictions about future events. This somewhat relates to (b) in that a prediction that doesn't come true immediately breaks the theory. That said, the predictions can sometimes have very long time frames. The Higgs boson, for example, was predicted in the 60s and only shown to be true (with high confidence) a few years ago -- a good 50 year gap, give or take. Climate change theories look to have similar or even longer predictive time scales (but much worse for the world if the predictions end up being true!)

    That's really all there is to it. The citation idea mostly goes towards fulfilling (a) -- "all available data" is often a huge huge quantity of stuff to work with, so it's often easier to work with an existing theory that already deals with most of the data and then tweak it a bit to encompass a bit of extra data and make new predictions, rather than starting from scratch every single time.

  9. Aaron Swartz - JSTOR by Trax3001BBS · · Score: 2

    Died making this easier https://en.wikipedia.org/wiki/...

    JSTOR is still available, I got it years ago.

    If his name is fuzzy, he co-founded Reddit.com