Slashdot Mirror


Open Library Project Takes Flight

Aaron Swartz today announced the launch of the new Open Library project. The goal of the project is to produce the world's greatest library on the Internet, free for anyone to use. Starting with the Internet Archive's book scanning project and organizing the insertion of new content via a wiki-type model, the project seems to be off to a great start. The demo, source code, and mailing lists were all opened up today in hopes of drawing interest from the public at large.

11 of 126 comments

  1. In response to your question: by CaptainPatent · · Score: 3, Interesting
    FALTWSBTFA: (From a link to what should be the feature article)

    "What if there was a library which held every book? Not every book on sale, or every important book, or even every book in English, but simply every book"

    It would probably be sued for copyright infringement.
    --
    Well, back to rejecting software patent applications.
    1. Re:In response to your question: by timeOday · · Score: 2, Interesting
      Speaking of which, do you think people would be allowed to drive cars or own guns if they were invented today? I don't.

      Anyways, the good news is that libraries do exist, and aren't going away. If the electronic library is to exist, it should be pursued as an extension of existing libraries. In other words, we must ensure that electronic access to text grows out of the familiar library setting, not Napster. There are lots of ways to do this.

      For instance, current library filing systems are really just electronic card catalogues, which is quite primitive - what if whoever catalogued the book didn't think up the same keywords you did? Only by digitizing the books will we be able to use all the information retrieval algorithms that make searching the WWW so effective. This would be very useful even if users couldn't "click through" the search results to the content of the book.

      Another good argument for digitization is preservation. It just seems reckless not to have an easily duplicated archive of all published works.

      After that, I hope we could consider exemptions to copyright that allow electronic access from anywhere, for a fee. Call it "compulsory licensing" if you like, but it really just means "we won't prohibit people from accessing the information, but we will make them pay and give you the money," which sounds better and happens to be true.

  2. wikipedia 2.0 by wizardforce · · Score: 1, Interesting

    So basically they are building an online library that works a lot like Wikipedia [Creative Commons, I presume]. How do they incorporate editing into the system without it having the same problems that Wikipedia has? What does the project do that couldn't just as easily be done by expanding Wikipedia? Any thoughts?

    --
    Sigs are too short to say anything truly profound so read the above post instead.
  3. Libraries don't get sued for infringement by Anonymous Coward · · Score: 2, Interesting

    Even in these litigation-happy days, physical book libraries don't get sued, and indeed they normally get direct governmental funding to continue their work.

    If an electronic library can find a way to obtain support as a literacy project, there are plenty of traditional avenues open. Suits against council literacy efforts don't go down well, at least in Europe.

  4. More Optimism, less cynicism by Anonymous Coward · · Score: 1, Interesting

    As an anonymous coward with little desire to register at this point I would like to say that such an Open Library should be labeled as the "wonder" of our digital age. All you cynics complaining about copyright are being too idealistic at this point (irony is fun). The website clearly stated that it will catalog information on where to buy or borrow (from brick libraries) the books it lists. This alone would be a great source, and even still many books WILL be offered online.

    Perhaps such a project would eventually inspire many publishers to "donate" their copyrighted materials under a special license that would let them retain the right to be the sole publisher of paper copies. I think many books 5 to 10 years after publication would probably receive this treatment as the peak sale period has already passed.

    Remember, even within a twisted and convoluted legal system the power is still with the people (in this case, most importantly the COPYRIGHT holders).

    -Gabriel

  5. Re:Project Gutenberg by Reziac · · Score: 2, Interesting

    I've been using the openlibrary.org site for a while now. I find these scanned original pages FAR more restful to the eye than any other form of electronic book. This way, I can sit down and read a complete book on the screen -- without suffering the eye fatigue that comes from reading large swaths of ordinary onscreen text. I think it has a lot to do with print fonts being designed specifically for the eye, and somewhat to do with the normal yellowing of paper that produces a less glary background.

    Also, many of these old texts, especially popular fiction from the late 1800s, have been discarded by meatspace libraries, so are otherwise pretty much unavailable -- and quite possibly in danger of being lost to the public altogether. (The first such book I picked at random to read, a late-1800s novel I'd never heard of, also proved to be a very relaxing way to spend an evening.)

    Anyway, I've been thrilled with the project, especially with the ability to download the scanned images as well as the plain text.

    --
    ~REZ~ #43301. Who'd fake being me anyway?
  6. Re:Take flight? by Reziac · · Score: 2, Interesting

    Actually, you're wrong -- to "take flight" primarily means to take off, or to start a project. So the usage was correct.

    --
    ~REZ~ #43301. Who'd fake being me anyway?
  7. Re:IPL? by TTK+Ciar · · Score: 4, Interesting

    OpenLibrary is a lot more complete, for one: searching on "Ogorkiewicz" in IPL yielded no hits, while OL gave me several. The Archive is well-connected to various institutions like the Library of Congress and Bibliotech, and is able to pull a lot of help from these other organizations into making a more complete service.

    OpenLibrary is also a catalog of metadata, providing information for each book like physical format, publisher, ISBN#, number of pages, and so on. This metadata has a lot of holes for now, but hopefully that will change as publishers and/or people who own copies of these books fill in the blanks, much like the Internet Movie Database.

    Finally, OpenLibrary has its own staff which is dedicated to working with Internet Archive partners to make this the most complete catalog on the planet. IPL is cool (I like it!) but it does not seem to be very actively maintained.

    (disclaimer: I work for The Internet Archive, but I do not speak for it, and the OpenLibrary team is in a completely different department from mine so DO NOT treat this post as necessarily any more authoritative or correct than any other slashdot post.)

    -- TTK

  8. Re:Project Gutenberg by Fallingcow · · Score: 3, Interesting

    What I really want are some modern, well-written footnotes and introductions to older works. Maybe throw in some good annotated maps when appropriate.

    Older books are often hard to relate to without some context, and that sort of thing is what makes or breaks many editions of the "classics", IMO. If, when shopping for books, I pick up a copy of a book that was written more than 200 years or so ago, and it has no footnotes, most of the time I won't buy it. This is doubly true of translated works.

    Wikipedia can usually stand in for an introduction, but there's nothing like footnotes to get you closer to an older text, and nothing that I know of provides that. If someone started a project to provide that kind of information for Project Gutenberg books, I'd get on board to help. Bonus points if they're also putting them in formats that don't suck (making plain text look good on the screen is a pain in the ass).

    I'd start it up myself, but alas, I am poor (college). I'd definitely help out if someone else got it going, though.

    Until someone does that, PG is practically useless to me.

    Will this project do anything like that, or do you know of anyone who's doing this?

    It seems to me that 500-1,000 really well-edited, footnoted, and formatted free books are better than 21,000 books' worth of plain-text barf.

  9. I'm curious how they'll make money? by Anonymous Coward · · Score: 1, Interesting

    But I'm sure it'll come down to some banner ad/mining user data scheme. Books are old hat today; I've been cleaning house on reference and history books that are still useful if not the most current. This is also in direct contradiction to the way most librarians are seeing the world. They're gearing towards a future of information--it's all in databases and online sources, never mind that books, even if condensed into online parcels of information, are still useful. The metadata/database descriptor fields they're using seem to follow standard library format to a degree, the sort of stuff librarians supposedly require a master's degree to understand. Still, there's no catalog number system (Dewey Decimal, etc.) or seemingly any provision for serials. In this respect it looks more like a bookstore than a library.

    Lastly, in an age where visits to libraries are increasing mainly to use computers, while budgets keep dropping and print collections suffer (notice how many still have science books from the fifties in them?), I wonder how this will work, since it's a private enterprise. My dream would be the Library of Congress becoming the online resource with all the books available or at least links to where you can buy OR borrow them, but that will likely never pass. Still, one can dream.....

  10. Vandalism controls? by Creosote · · Score: 2, Interesting

    First thing I did on the site was pull up an entry for a book my university press publishes. It had no "Buy" option. I edited the metadata to add the ISBN-10 number for it, and voila, a Buy option.

    It then took a certain amount of self-control for me not to go into various titles dealing with George W. Bush and enter the ISBN-10 of the storybook containing "My Pet Goat". Purely as a proof of concept, you understand.

    This is simply the Wikipedia vandalism problem writ large. What controls will OpenLibrary put in place to guard against it?
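
One narrow, partial control that fits the ISBN edit described above is to reject identifiers whose check digit does not validate. The sketch below is not Open Library's code, and the function name `is_valid_isbn10` is hypothetical. It relies only on the standard ISBN-10 rule: weight the ten characters 10 down to 1, allow a final `X` to stand for 10, and require the weighted sum to be divisible by 11.

```python
def is_valid_isbn10(raw: str) -> bool:
    """Return True if raw is a well-formed ISBN-10 with a correct check digit.

    Hypothetical helper for illustration; not part of any Open Library API.
    """
    # Hyphens and spaces are cosmetic separators, so drop them first.
    chars = raw.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False

    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' is only legal as the final check character
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        # Weights run from 10 (first character) down to 1 (check character).
        total += (10 - position) * value

    return total % 11 == 0


if __name__ == "__main__":
    print(is_valid_isbn10("0-306-40615-2"))  # True
    print(is_valid_isbn10("0-306-40615-3"))  # False
```

This only catches typos and malformed numbers. A well-formed ISBN attached to the wrong title, as in the prank described above, would still pass, so a check like this would complement edit history, reverts, or review rather than replace them.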