Elsevier Opens Its Papers To Text-Mining
ananyo writes "Publishing giant Elsevier says that it has now made it easy for scientists to extract facts and data computationally from its more than 11 million online research papers. Other publishers are likely to follow suit this year, lowering barriers to the computer-based research technique. But some scientists object that even as publishers roll out improved technical infrastructure and allow greater access, they are exerting tight legal controls over the way text-mining is done. Under the arrangements, announced on 26 January at the American Library Association conference in Las Vegas, Nevada, researchers at academic institutions can use Elsevier's online interface (API) to batch-download documents in computer-readable XML format. Elsevier has chosen to provisionally limit researchers to 10,000 articles per week. These can be freely mined — so long as the researchers, or their institutions, sign a legal agreement. The deal includes conditions: for instance, that researchers may publish the products of their text-mining work only under a license that restricts use to non-commercial purposes, can include only snippets (of up to 200 characters) of the original text, and must include links to original content."
Time to disprove some punks.
Isn't this called search engine spamming? Several publishing outfits have been doing it for about a decade, with varying degrees of success.
1. Please generate as many sales leads as you can 2. Profit!!!
Wake me up when I can get all those taxpayer-funded IEEE papers online for free. *grumble*
Get Watson over here will you?
Time flies when you don't know what you're doing
If the Internet is killing newspapers, why isn't it killing this dead tree company?
I like this bit from TFA:
Shillum says that Elsevier is ahead of the curve — but that other publishers are likely to follow soon. CrossRef, a non-profit collaboration of thousands of scholarly publishers, will in the next few months launch a service that lets researchers agree to standard text-mining terms and conditions by clicking a button on a publisher’s website, a ‘one-click’ solution similar to Elsevier’s set-up.
I would like to see that.
... publishers removed the paywall to publicly funded literature, or at least made the prices more sane.
Also, while we're on the topic of text mining, would it be possible to get text-only or XML-based articles, with figures attached and cross-references as needed? It's quite annoying to manually convert a PDF when trying to set up an automated analysis over several documents. I know one could set up a shell script to dump it out using the pdftoxml converter, but the output is a bit messy to parse.
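For what it's worth, here is a minimal sketch of why clean XML would make that kind of automated analysis easier. It uses only Python's standard library. The element names (`article`, `title`, `para`) and the sample documents are made up for illustration and are not Elsevier's actual schema.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_string):
    """Return all text content of an XML document as one whitespace-normalized string."""
    root = ET.fromstring(xml_string)
    # itertext() walks every element in document order and yields its text.
    raw = " ".join(root.itertext())
    # split()/join collapses runs of spaces and newlines into single spaces.
    return " ".join(raw.split())


def word_counts(documents):
    """Map each document name to the number of words in its extracted text."""
    return {name: len(extract_text(xml).split()) for name, xml in documents.items()}


if __name__ == "__main__":
    # Hypothetical sample articles; real publisher XML would be far richer.
    docs = {
        "a.xml": "<article><title>A  title</title><para>Some\n text.</para></article>",
        "b.xml": "<article><title>Other</title><para>More words here.</para></article>",
    }
    for name, count in word_counts(docs).items():
        print(name, count)
```

With well-formed XML the whole extraction step is a few lines, whereas converted PDF output usually needs layout-specific cleanup first.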
Lawyers for Amazon are envisioning enlarging their swimming pools...
ALA Midwinter was in Philadelphia, PA this year. The upcoming ALA conference this summer will be in Las Vegas.
Soon... once the exclusive contracts and the End User License Agreements expire, the users will revolt. It was foretold in the Scientific Prophecy of Rebirth.
Oh, never mind, I just noticed that he charges money instead of twinkies. 120 euros, or 163 dollars, per hour. Lordy..
that should have been pruned long ago.
Haha, back in the 90's, I worked at a company that built some websites for Elsevier. The effort was overseen by a young Dutch woman who came to our offices and wanted to know why we didn't have orange juice and buns for her every morning.
We designed a background image that looked great at normal viewing distances from the screen, but when seen from far away it looked like it really said "GReed-Elsevier". The sites went public, but we were made to change the background about a week after launch.
According to “Why you and I should NOT sign up for Elsevier’s TDM service” [0], this is not all that good, as the Text and Data Mining policy is actually overly restrictive. Most notably, it forces you to go through their API to do the work, rather than parsing things locally at your leisure, and imposes conditions on the release of the uncovered data (namely a non-free CC-NC license).
[0] http://blogs.ch.cam.ac.uk/pmr/...
nuff said really
Note:
If you have to sign or agree to something in order to access it, it's not free, even if they say otherwise.
Why? Just look at the never ending list at https://en.wikipedia.org/wiki/...