Slashdot Mirror


Free Online Scientific Repository Hits Milestone

ocean_soul writes "Last week the free and open access repository for scientific (mainly physics but also math, computer sciences...) papers arXiv got past 500,000 different papers, not counting older versions of the same article. Especially for physicists, it is the number-one resource for the latest scientific results. Most researchers publish their papers on arXiv before they are published in a 'normal' journal. A famous example is Grisha Perelman, who published his award-winning paper exclusively on arXiv."

33 of 111 comments

  1. I Am Forever in Debt to Arxiv by eldavojohn · · Score: 4, Interesting

    When I was a freshman at the University of Minnesota, a professor instructed us to use Arxiv as a resource (I think Citeseer was another but paled in comparison). A large part of my undergrad and grad school days were spent perusing Arxiv and sometimes implementing ideas I had read in the Computer Science section. My hard drive became strained by the sheer number of PDF/PS files in my user directory. My room was littered with papers printed off to read on the bus or at work. My base knowledge of computer science I owe to my professors, most of the things beyond that came from Arxiv.

    I owe a lot of my knowledge to that site. Here's to another 50,000 papers, Arxiv. And another and another and another ...

    Also, the Arxiv Physics blog is a regular favorite in my Liferea news feed account.

    --
    My work here is dung.
    1. Re:I Am Forever in Debt to Arxiv by CRCulver · · Score: 3, Interesting

      Perhaps Arxiv works for the hard sciences, but for the social sciences and humanities giving people access to an online repository of papers doesn't necessarily mean that they can easily stay up to date with the field. I get the impression that a lot of current thought in the community of my field (linguistics) is passed on through relatively private e-mail lists and informal discussions at conferences, and might not be written down and published for years.

    2. Re:I Am Forever in Debt to Arxiv by vrmlguy · · Score: 3, Interesting

      My room was littered with papers printed off to read on the bus or at work.

      A good reason to buy an Amazon Kindle/Apple iPhone/Sony Reader.

      --
      Nothing for 6-digit uids?
    3. Re:I Am Forever in Debt to Arxiv by Tubal-Cain · · Score: 2

      If it's so informal, I have to wonder how Sociology and Literary Criticism stay up-to-date?

    4. Re:I Am Forever in Debt to Arxiv by aproposofwhat · · Score: 2, Funny

      Ah, so you're working in the oral tradition, then?

      --
      One swallow does not a fellatrix make
    5. Re:I Am Forever in Debt to Arxiv by bjorniac · · Score: 3, Insightful

      Well, this is the problem of perception a lot of people have - that scientists are the anti-social ones. Scientists cannot work in a vacuum - we need communications with one another, interactions and a knowledge of other work to get on with our own work. You build off other people's work, use the things recently discovered to move your own work forward, so you need to have constant fast communications of the latest discoveries. Good physicists are always talking to one another, asking about work done, clarifying points and collaborating - just check out how many of those papers have multiple authors, often at separate institutions.

      Compare this to a social science/humanity subject where sitting in your ivory tower is basically encouraged, with publications of great single-authored treatises seemingly the only output. They don't need to talk to one another and many are outright hostile to any discussion of their work.

      Disclosure: I'm a physicist with an SO in the humanities. The differences in our experiences are incredible - people in my department like each other and work together.

  2. i'm the first to comment by Anonymous Coward · · Score: 3, Insightful

    i'll beat all the cynical punch savvy posters to the punch!

    that comma is in the wrong place, i see 50,0000. I guess they need another article on properly writing numbers.

    1. Re:i'm the first to comment by Geoffrey.landis · · Score: 4, Informative

      >that comma is in the wrong place

      Right. The correct number is 500,000 (not "50,0000").

      (arxiv.org actually says 497,649 as of a moment ago.)

      --
      http://www.geoffreylandis.com
  3. Re:50,0000? by vrmlguy · · Score: 5, Funny

    It's half-a-million. CmdrTaco doesn't deal with such large numbers very often.

    --
    Nothing for 6-digit uids?
  4. There are interesting differences by mbone · · Score: 5, Interesting

    Here are some in fields I follow :

    In astrophysics, almost all new papers appear first in Arxiv.

    In planetary physics, some but by no means all papers appear in Arxiv.

    In geophysics, basically no papers appear in Arxiv.

    I don't know why there are these differences, but there it is.

    1. Re:There are interesting differences by 16384 · · Score: 4, Informative

      Condensed matter physics and high energy physics also have a large presence on Arxiv. As you say, it depends largely on which branch of physics you deal with.

  5. It's science by Anonymous Coward · · Score: 5, Funny

    If it's a science publication, should it have hit a kilometer-stone instead of a milestone?

  6. Re:Also #1 for mathematicians! by Gromius · · Score: 4, Insightful

    Likewise, every particle physicist also puts his papers there before they are published (my three are all there). While it is great as a source of open information, one thing to bear in mind is that it is not peer reviewed, *anybody* can stick *anything* there. This is the major reason why we still unfortunately need paper journals. We need somebody to read it and say yes this follows basic scientific procedures and to the best of his/her knowledge there are no mistakes. Because there's a fairly low signal to noise on arXiv and what's there is not guaranteed at all to be of proper scientific merit and correctness.

  7. 500,000+ articles by MosesJones · · Score: 5, Funny

    But the question we are all asking ourselves is

    Who got the first post?

    The answer is Exact Black String Solutions in Three Dimensions by James H. Horne and Gary T. Horowitz

    Slightly better than the "Fkrst Pist" attempts on Slashdot!

    --
    An Eye for an Eye will make the whole world blind - Gandhi
  8. How significant by igotmybfg · · Score: 2, Insightful

    Because quantity == quality...

  9. Re:50,0000? by $RANDOMLUSER · · Score: 2, Informative

    Well, according to TFsite, "3 Oct 2008: arXiv passes half-million article milestone", so that would be 5 * 10^5.

    --
    No folly is more costly than the folly of intolerant idealism. - Winston Churchill
  10. Hopefully this helps... by ruin20 · · Score: 3, Insightful
    ...convince the scientific journal community that they should open their standards and let articles published in their journals be republished by the author elsewhere.

    I'm not going to pretend 50,000 is a lot, but the fact it's 50,000 and growing should make them worry. I hope the celebration of this milestone will help accelerate its growth so we see 100,000 sooner rather than later. The quicker pay-for-access science disappears the better for all of us.

    --
    Oh honey look... How cute... an angry slashdotter!
    1. Re:Hopefully this helps... by Hikaru79 · · Score: 2, Informative

      The summary misplaced a comma. The actual total is 500,000 not 50,000.

  11. Re:Also #1 for mathematicians! by Anonymous Coward · · Score: 2, Informative

    one thing to bear in mind is that it is not peer reviewed, *anybody* can stick *anything* there.

    This is true. However, they do have a group of moderators which recategorizes what they think are "merely mediocre, speculative, or erroneous articles". See http://front.math.ucdavis.edu/ifaq#nonsense

    Of course, this is not the same as peer-review, but at least it's something.

  12. Re:Also #1 for mathematicians! by Aalst · · Score: 2, Informative
    Allyn Jackson (editor of Notices of the AMS) has published an interesting article about the impact of preprint servers on mathematics: http://www.ams.org/notices/200201/fea-preprints.pdf
    From the article:

    As an experiment, Greg Kuperberg looked at the publication status of the first 100 papers in theoretical high energy physics posted to the arXiv in December 1998. He found that 81 had appeared in journals, 11 were conference proceedings or invited lectures, and 2 were Ph.D. theses. "Thus at least 94 of the 100 have been blessed by some form of peer review," he concludes.

  13. Fifty ten-thousand? by Guysmiley777 · · Score: 2, Funny

    Wow, that's a lot of ten-thousands of papers!

    --
    Coding with assembly is like playing with Legos. Coding an application in assembly is like building a car with Legos.
  14. Re:What about peer-review? by Geoffrey.landis · · Score: 2, Informative

    Peer review is great for some things, but just ask Galileo how 'peer review' worked for him. 7 years in a prison as a part of the inquisition. I do realize, that today scientific breakthroughs are treate

    Just a note, Galileo's trial by the inquisition was not a problem of peer reviewing: it wasn't that he couldn't get his work published; it was what happened after it was published.

    --
    http://www.geoffreylandis.com
  15. In other news... by bakuun · · Score: 3, Interesting

    PubMed Central, the central repository for open access Life Sciences research articles, is pushing on 1.3 million articles. These repositories are a wet dream for text mining researchers.

  16. Re:Also #1 for mathematicians! by Dronak · · Score: 2, Informative

    it is not peer reviewed, *anybody* can stick *anything* there.

    I think they've changed things a little bit over time. It does seem like anyone is able to register an account, which would allow them to start submitting papers. But looking at the help pages, I see this on an endorsement system: "Effective January 17, 2004, arXiv.org began requiring some users to be endorsed by another user before submitting their first paper to a category or subject class." They note that this isn't peer review, but it "will verify that arXiv contributors belong [to] the scientific community". They also moderate submissions, and the help page on this topic says: "arXiv reserves the right to reject or reclassify any submission." While also not real peer-review, it "helps to ensure that arXiv content is relevant to current research".

    Perhaps some areas are better than others about self-moderating/reviewing submissions. My experience with the astro-ph archive, which I've read for many years, is that most of it is generally good material, often pre-prints of papers that will appear in peer-reviewed journals or conference proceedings. Not all of it is like that of course, but I think there's a lot more signal than noise in the astro-ph section at least. Just my opinion.

  17. Re:Paper must die by Gromius · · Score: 3, Insightful

    Oh exactly. I think the future is a peer reviewed online journal. I think arXiv provides a very valuable service as is for the distribution of knowledge. Right now it has a copy of basically every particle physics paper published and I assume this is true for some other fields too. Many times I grab the arXiv copy over the journal copy as it's more convenient. So all the journal does is basically place a peer reviewed stamp of approval on the online arXiv paper and this could easily be replaced with an online journal in the future.

    I am strongly against journal sub fees as I believe that the knowledge contained in scientific papers (doubly so for publicly funded ones) should be available to all and not only accessible to people willing to pay the high cost of a journal subscription fee. CERN is pushing open journals for that very reason and that may evolve into a respected online peer reviewed journal which will complement arXiv nicely.

  18. Like anything else: quantity and ease of access by Dr.+Zowie · · Score: 4, Insightful

    Because quantity == quality...

    I realize that you were being snarky, but you accidentally hit on a corner of the truth. The real value of the ArXiV is indeed its quantity of results, mixed with the ease of access. The traditional journals typically restrict access to their output -- unless you are at a subscribing institution, it costs $15-$50 to access a single article from a single traditional scientific journal (depending on publisher). At professional institutes and universities, which typically have online subscriptions to journals, it is possible to surf through the Literature (depending on field, back about 10-15 years) and find recent relevant knowledge extremely quickly. If you aren't at an institution that subscribes, you're SOL. ArXiV fixes that - if you publish your article both in a journal and in the ArXiV, most indexing services will notice that it is the same, and suddenly everyone on the planet has unrestricted access. That's a no-brainer for an author.

    The way that professional scientists (like me -- I am a solar astrophysicist) access the Literature has changed drastically in the last ten years. My office has about 12 linear feet of Xeroxed journal articles in three-ring binders, but I practically never refer to them. It's far faster and more convenient to access (say) the entire archives of Astrophysical Journal online than to go "grep dead trees" at the library. Citation indices such as ADS (Google for adsabs) hyperlink both references and citations, so that I can search through 50 articles relevant to a topic in less time than it used to take to look up one article and Xerox it for reading outside the library.

    Old-style pay-to-read journals get in the way of that rapid access - for example, I have rarely cited articles in Astronomy and Astrophysics, because it's a pain in my ass to download them. Until recently, my institute didn't subscribe, so I had to either pay on a per-article basis (which adds up if you are skimming for the one relevant article in a dozen possibilities), or travel to the local university to get the paper I wanted. This is a very common problem: even large universities generally don't subscribe to all the relevant journals in a given field, because web subscriptions cost thousands to tens of thousands of dollars per year per journal!

    For everyone not fortunate enough to have a computer account at a large institute that can actually afford to subscribe to dozens of journals, ArXiV is the best way to access a large volume of the literature. Hence, articles posted to the ArXiV get cited more. That makes authors want to post to the ArXiV as a matter of course. It's a virtuous circle.

    So, er, yes, quantity is quality in this case -- ArXiV was canny and/or lucky enough to get a critical mass of good work, and the quantity is the driving force that keeps the whole thing going.

  19. Re:Also #1 for mathematicians! by Gromius · · Score: 3, Interesting

    But that was 1998, when a) the general population was just getting online and b) pretty much only scientists knew about arXiv. There is a lot of peer reviewed stuff on there (every paper submitted to a journal tends to be submitted) but as more less-mainstream scientists have access, you regrettably get more noise. Looking at Oct 2007 for hep-th, and assuming that it would be mentioned in the summary if it's published or going to be published (and trust me people mention this...), out of the first 25, 12 are published in a journal and/or conference proceedings. So less than 50% were blessed by some form of peer review. And the other 50% tend to be the most sensational :)

    Note I still think it's very valuable to have a place where non-peer reviewed material can be uploaded as well as peer reviewed, but if it's not peer-reviewed it's a lot more likely to be incorrect somehow and the reader needs to be aware of that.

  20. The arXiv is great, but..... by moosesocks · · Score: 2, Interesting

    We really need to begin compiling our scientific knowledge into a hyperlinked wiki/database of sorts.

    Wikipedia's great for basic stuff, though there's still gobs of information (much of which is in the public domain) that's inexplicably confined to books and journals.

    Hyperlinks (and extended data sets) should be *standard* for all journal articles these days, given that we have the technology to do so. There's no reason that the arXiv needs to remain as a repository for dead-tree PDFs.

    --
    -- If you try to fail and succeed, which have you done? - Uli's moose
    1. Re:The arXiv is great, but..... by Dr.+Zowie · · Score: 2, Informative

      At some level, hyperlinks (at least) are standard. They're called "references" and were the closest thing to a hyperlink before the intertubes were invented. Several free services (ADS is one: http://adsabs.harvard.edu/) have spiders that walk the literature and create genuine URL-style links between articles. ArXiV is advancing custom along that path, by making many journal articles available for linking to anyone free of charge.

      Extended data sets are coming. Astrophysical Journal allows online publication of movies and data to support articles, and I imagine that ArXiV will one day too. (Though they don't have the server space to support many of the data sets that are written about in those PDFs).

      Meanwhile, most^H^H^H^Hmany scientific authors are happy to give you their original data -- just write to them and ask for it!

  21. Not a search engine. by Anonymous Coward · · Score: 2, Informative

    To clarify, arxiv is a document repository (you submit your papers there). If you want a scientific papers search engine, use citeseer.

    Note that citeseer also indexes arxiv documents :)

  22. Re:Is there peer review? by plusungood · · Score: 2, Informative

    It is NOT peer reviewed, but around half the papers eventually get accepted in a journal or a conference proceeding. It doesn't only contain articles, but also overviews, books and introductions.

  23. XXX.LANL.GOV by Lawrence_Bird · · Score: 2, Interesting

    was the original .. with the skull/crossbones icon. Now it's all too easy and happy looking.

  24. Re:Double publishing question? by bjorniac · · Score: 2, Informative

    Here's how it works (for me at least):

    First you write a paper - this is the hard part. Then you can submit it to Arxiv - usually done at the same time as submission to a journal, though some choose to wait for any initial backlash/corrections before doing this. Arxiv normally publishes it the next working day with no peer review (8pm EST the night before) for all to see online. Meanwhile your journal is still looking for peer reviewers. No journal in physics can now ask to be the sole source for any article - all authors have to sign a contract often stating that no other commercial source will exist, but that the author can have a copy on his/her homepage (or other place) for free distribution, and on the preprint archive.

    Double-blind journal review becomes single-blind when you publish on Arxiv - you can't see your reviewer's name but he can easily find yours. This way it's still possible for someone who gets your paper to screw you over if they don't like you, but then again you can appeal to the editors if you think this is happening. Since you don't know your reviewer, you can't exert any pressure on them (in theory) to accept your paper, so the review part keeps most of its integrity. Reviewers are also required to give detailed reasons for rejecting/accepting papers so you really have to justify your reasons not just "I like/dislike this guy".

    In many fields, the blind part of peer review isn't all that blind. Often from the suggestions for citations for example you can get an idea of who your reviewer is. In small fields it's pretty hard not to know everyone in it anyway - especially since you're normally familiar with everyone else's work in your area, and because reviewers are chosen as experts in that area.