Open Library Project Takes Flight
Aaron Swartz today announced the launch of the new Open Library project. The goal of the project is to produce the world's greatest library on the Internet, free for anyone to use. Starting with the Internet Archive's book scanning project and organizing the insertion of new content via a wiki-type model, the project seems to be off to a great start. The demo, source code, and mailing lists were all opened up today in hopes of drawing interest from the public at large.
Project Gutenberg never really had a large enough selection to interest me. I would like to see how they do with this new library.
God spoke to me.
Everything about this "Open Library" - from the colors to the fonts used - looks just like Project Gutenberg. Am I missing an important difference?
Perhaps this is going to contain books still under copyright? I doubt the full text will be available, which makes this "library" pretty useless.
Well, back to rejecting software patent applications.
Have these guys not heard of Project Gutenberg?
It's been around for years, and I thought it was pretty well-known.
Someone asked a good question on the website: how does this relate to Gutenberg?
http://www.gutenberg.org/wiki/Main_Page
They have a great collection of ebooks online already, and you're free to grab and share them. I wish they would base this in a country that doesn't have insanely long copyright terms, though; then it could really add value over Gutenberg.
"I can't believe it's not a hyperlink."
As long as it is limited to rather dusty tomes that are "out of copyright" this is going to have limited, if not zero, value to most people. What exactly is the difference between Open Library and Project Gutenberg? Aren't they going to have 99% overlapping content?
So basically they are building a library that works a lot like Wikipedia, but as an online library (Creative Commons, I presume). How do they incorporate editing into the system without it having the same problems that Wikipedia has? What does the project do that couldn't just as easily be done by expanding Wikipedia? Any thoughts?
Sigs are too short to say anything truly profound so read the above post instead.
"Taking flight" normally denotes escape from a perilous situation, not emergence as is intended by the author.
Mod me down if you must but it's annoying when otherwise intelligent people cannot write a simple sentence and the editors are so lax in their responsibilities.
I must be new here.
People who understand both literature and libraries should be behind this project--not ego-geeks.
I know their intentions are good, but for these various online text-searchable book projects to be of maximum usefulness, they really need to be merged into one big project. Or, at the very least, a search engine needs to be set up that will search them all. Right now I basically just stick to Google Books, although I'm fully aware that the content I'm looking for but can't find is likely out there in one of the other few dozen open library projects.
Even in these litigation-happy days, physical book libraries don't get sued, and indeed they normally get direct governmental funding to continue their work.
If an electronic library can find a way to obtain support as a literacy project, there are plenty of traditional avenues open. Suits against council literacy efforts don't go down well, at least in Europe.
As an anonymous coward with little desire to register at this point I would like to say that such an Open Library should be labeled as the "wonder" of our digital age. All you cynics complaining about copyright are being too idealistic at this point (irony is fun). The website clearly stated that it will catalog information on where to buy or borrow (from brick libraries) the books it lists. This alone would be a great source, and even still many books WILL be offered online.
Perhaps such a project would eventually inspire many publishers to "donate" their copyrighted materials under a special license that would let them retain the right to be the sole publisher of paper copies. I think many books 5 to 10 years after publication would probably receive this treatment as the peak sale period has already passed.
Remember, even within a twisted and convoluted legal system the power is still with the people (in this case, most importantly the COPYRIGHT holders).
-Gabriel
Don't compare this to Project Gutenberg. This is supposed to be the "Internet Movie Database" for books (as far as I understand, anyway). I am pretty sure that a big part of this information can be filled in with calls to Amazon web services.
... here it is.
Or creative commons, like Cory Doctorow's work (which is on par with most similar fiction).
Or just old, almost like James Joyce's work, which arguably nobody reads, but for Joyce at least, a lot of people talk about it.
And as for getting stuff...at least for now, the experience of an ebook is a lot less enjoyable to most people than that of a dead tree book. Dead tree books have portability advantages as well. So if someone likes a book they find on Open Library, they might well buy it on Amazon.
How is this going to be different than the Internet Public Library? http://www.ipl.org/
Heard any good sigs lately?
The new site looks like it will host info on books and only provide downloads for some. It looks like a good place to find info (like the ISBN) of an old book you only remember parts of.
21st-Century-Citizen
She is correct. This is not a 'library' per se but a catalog of books, with links to PG, Amazon, B&N, etc. Most books are NOT free.
The difference between this and other catalogs (Library of Congress, etc.) is that presumably you can customize it more.
Where are the Pirate Bay kiddies on this? Wouldn't that fit their idea of all the information belonging to 'the people?'
Or does it only apply to stealing popular movies and music?
Obama likes poor people so much, he wants to make more of them.
is an error like this:
<type 'exceptions.TypeError'> at /search
unbound method remove_node() must be called with LRU instance as first argument (got NoneType instance instead)
Python /1/pharos/code/production/pharos/infogami/tdb/tdb.py in remove_node, line 607
Web GET http://demo.openlibrary.org/search
Traceback (innermost first)
/1/pharos/code/production/pharos/infogami
...
node = LRU.remove_node(node)
/1/pharos/code/production/pharos/infogami/t
...
self.remove_node()
That was just my first try, but it doesn't really encourage me to try again.
...the future crusty old bastards are already drinking the Kool-Aid.
This is great news; I hope it actually works. Related: I recently discovered my local library has about 50% of the books I usually buy. Why didn't I think of this earlier? Must have lost about $10K from that during the last decade. Now, if you'll excuse me, I must go check out a copy of "How to Make Your Very Own Video Game in 16 Days Using ONLY...Wordstar!"
I read and enjoyed A Portrait of the Artist as a Young Man. No need to speculate, people do read Joyce.
http://www.booksinmyphone.com/ lets you have the portability advantages of the dead-tree versions (better even as you are carrying the phone anyway) for a good selection of PD content.
Uhm, it's kinda already done with Project Gutenberg and Librivox. How is this different?
But I'm sure it'll come down to some banner ad/mining user data scheme. Books are old hat today; I've been cleaning house on reference and history books that are still useful, if not the most current. This is also in direct contradiction to the way most librarians are seeing the world. They're gearing towards a future of information where it's all in databases and online sources, never mind that books, even if condensed into online parcels of information, are still useful. The metadata/database descriptor fields they're using seem to follow standard library format to a degree, the sort of stuff librarians supposedly require a master's degree to understand. Still, there's no catalog number system (Dewey Decimal, etc.) or seemingly any provision for serials. In this respect it looks more like a bookstore than a library.
Lastly, in an age where visits to libraries are increasing mainly to use computers, and budgets keep dropping and print collections suffer (notice how many still have science books from the fifties in them?), I wonder how this will work since it's a private enterprise. My dream would be the Library of Congress becoming the online resource with all the books available, or at least links to where you can buy OR borrow them, but that will likely never pass. Still, one can dream.....
How about placing the servers somewhere where copyright law holds no sway?
Are there really any working data havens?
First thing I did on the site was pull up an entry for a book my university press publishes. It had no "Buy" option. I edited the metadata to add the ISBN-10 number for it, and voila, a Buy option.
It then took a certain amount of self-control for me not to go into various titles dealing with George W. Bush and enter the ISBN-10 of the storybook containing "My Pet Goat". Purely as a proof of concept, you understand.
This is simply the Wikipedia vandalism problem writ large. What controls will OpenLibrary put in place to guard against it?
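One cheap control, well short of real moderation, would be to reject any edit whose ISBN does not even pass its own check digit. A minimal sketch in Python, assuming a hypothetical helper named `is_valid_isbn10` (this is not Open Library's code). It would catch typos and random garbage, but not a perfectly valid ISBN attached to the wrong book, which is exactly the vandalism described above.

```python
def is_valid_isbn10(raw: str) -> bool:
    """Return True if raw is a well-formed ISBN-10 with a correct check digit."""
    # Hyphens and spaces are only cosmetic separators in an ISBN.
    chars = raw.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' stands for 10 and is only legal as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        # Weights run from 10 for the first character down to 1 for the last.
        total += (10 - position) * value
    # A valid ISBN-10 has a weighted sum divisible by 11.
    return total % 11 == 0
```

A check like this only filters malformed input; guarding against deliberate mislabeling would still need edit history, review, or rate limiting.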
One of the biggest problems remaining with these developing libraries is that we still have no good portable reader, but at least in this case they want to support 'print on demand'. It hooks into the Internet Archive's book scanning project, and their vision is fantastic.
http://www.europeana.eu/ (... but they also have classical music)
http://www.liberliber.it/ (Italian language)
http://www.gutenberg.org/
http://www.babelteka.org/ (Italian language)
I know the project is just starting, but here it goes.
They should republish the raw data the same way Wikipedia and even IMDb do. I for one am not going to contribute to any data collection project that I can't later use myself.
Their schema doesn't differentiate between editions. If I understand it right, that means that for the 3000 existing editions of "Tom Sawyer" released over the years, by different publishers in different countries and languages, the book's description has to be replicated for each one. That can't be good. I don't have a quick solution to this myself. Sometimes (esp. with tech books), a new edition changes content significantly compared to the previous one, sometimes they're exactly the same.
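One common way around this, sketched here under my own assumptions (the class and field names are hypothetical, not Open Library's schema), is to store the shared description once on a "work" record and keep only edition-specific fields on each edition. An edition whose content really did change can carry its own description, which overrides the shared one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edition:
    publisher: str
    year: int
    language: str
    # Only set when this edition's content differs from the work as a whole.
    description_override: Optional[str] = None


@dataclass
class Work:
    title: str
    author: str
    description: str  # stored once, shared by every edition
    editions: List[Edition] = field(default_factory=list)

    def description_for(self, edition: Edition) -> str:
        """Return the edition's own description if it has one, else the shared one."""
        return edition.description_override or self.description


# Made-up sample data; the publisher and years are placeholders.
tom_sawyer = Work(
    title="The Adventures of Tom Sawyer",
    author="Mark Twain",
    description="A boy grows up along the Mississippi River.",
)
plain = Edition(publisher="Example Press", year=1996, language="en")
annotated = Edition(
    publisher="Example Press",
    year=2004,
    language="en",
    description_override="Annotated edition with a new introduction.",
)
tom_sawyer.editions.extend([plain, annotated])
```

With this split, 3000 editions cost 3000 small edition rows plus one description, and the tech-book case (a rewritten second edition) is handled by the override field.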
Collecting the cover images is a great service. However, doesn't this infringe on the publisher's copyright? Is this still fair use? What about countries like Germany without fair use laws--will German books still be OK because the data is collected in the USA (I guess)?
Add a feature to upload book descriptions as XML. Suggest a DTD. I have a list of my book collection stored as an XML file, so have others (maybe not natively, but book collection management software usually has an export function). It should be possible to automate the process of adding book information already stored in some digital format.
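A rough sketch of what such an import could look like, using Python's standard `xml.etree.ElementTree`. The element names (`books`, `book`, `title`, `author`, `isbn`) and the function `parse_books` are invented for illustration; as far as I know the project has not proposed a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical export from a book collection manager.
# The ISBN below is a placeholder, not a real identifier.
SAMPLE_XML = """
<books>
  <book>
    <title>The Adventures of Tom Sawyer</title>
    <author>Mark Twain</author>
    <isbn>0000000000</isbn>
  </book>
  <book>
    <title>A Portrait of the Artist as a Young Man</title>
    <author>James Joyce</author>
  </book>
</books>
"""


def parse_books(xml_text: str) -> list:
    """Turn a <books> document into a list of plain dicts, one per <book>."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for book in root.findall("book"):
        records.append({
            "title": book.findtext("title", default="").strip(),
            "author": book.findtext("author", default="").strip(),
            # Optional field: None when the export has no <isbn> element.
            "isbn": book.findtext("isbn"),
        })
    return records


records = parse_books(SAMPLE_XML)
```

Each dict could then be submitted as one catalog entry, so a whole personal collection goes in with a single upload instead of hundreds of form edits.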
There should be some category system to pick from. Some may put Tom Sawyer into "Novel, USA antebellum", others into "Novel, USA 19th century".
Somehow connect this to Wikipedia. The more prominent books have article pages. Maybe data could be retrieved from it as well. There are currently Tom Sawyer articles in 16 or so languages.
The edit page should group items better: stuff everyone understands (year published, title) first, then those things only specialists know.
The edit page's descriptors shouldn't be images but text which links to an explanation page for the same reason. BISAC? LCCN? UCC13? I know, I can find out what those are with a search engine, but I shouldn't have to.
Prepare for i18n. I guess LCCN is a Library of Congress Control Number? Those types of libraries exist in other countries, too. Each book can have a gazillion codes. Make this another tuple in the database: (book_id, code_id, code_value) instead of (book_id, lcc_id, isbn10, isbn13, 10 other codes in the same record).
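To make the tuple idea concrete, here is a small sketch using Python's built-in `sqlite3`. The table names (`book`, `code_type`, `book_code`) and the helper `codes_for` are my own invention, and the identifier values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (
    book_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE code_type (
    code_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE   -- e.g. 'isbn10', 'isbn13', 'lccn'
);
CREATE TABLE book_code (
    book_id    INTEGER NOT NULL REFERENCES book(book_id),
    code_id    INTEGER NOT NULL REFERENCES code_type(code_id),
    code_value TEXT NOT NULL,
    PRIMARY KEY (book_id, code_id, code_value)
);
""")

conn.execute("INSERT INTO book (book_id, title) VALUES (1, 'Tom Sawyer')")
conn.executemany(
    "INSERT INTO code_type (code_id, name) VALUES (?, ?)",
    [(1, "isbn10"), (2, "isbn13"), (3, "lccn")],
)
# Placeholder values, not real identifiers.
conn.executemany(
    "INSERT INTO book_code (book_id, code_id, code_value) VALUES (?, ?, ?)",
    [(1, 1, "PLACEHOLDER-ISBN10"), (1, 3, "PLACEHOLDER-LCCN")],
)


def codes_for(book_id: int) -> dict:
    """Return {code name: value} for one book."""
    rows = conn.execute(
        "SELECT t.name, c.code_value FROM book_code AS c "
        "JOIN code_type AS t ON t.code_id = c.code_id "
        "WHERE c.book_id = ? ORDER BY t.name",
        (book_id,),
    )
    return dict(rows.fetchall())
```

Supporting another country's national library number then becomes one new row in `code_type` rather than a new column on every book record.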
Also i18n: store language codes with all textual columns. A description is most likely going to be Hungarian for a book published in Hungary in Hungarian.
This complicates the schema a lot. Having very few tables is tempting, but it usually doesn't work well with the real world.
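For the language-tagging part, a minimal sketch, assuming each textual field is stored as a mapping from ISO 639-1 language code to text. The function name `pick_description` and the fallback order are my own assumptions.

```python
from typing import Dict, Optional


def pick_description(descriptions: Dict[str, str],
                     preferred: str,
                     fallback: str = "en") -> Optional[str]:
    """Choose a description for a reader, keyed by ISO 639-1 language code."""
    if preferred in descriptions:
        return descriptions[preferred]
    if fallback in descriptions:
        return descriptions[fallback]
    # Neither language is stored: show whatever exists rather than nothing.
    for code in sorted(descriptions):
        return descriptions[code]
    return None


# Made-up sample: a Hungarian book with only a Hungarian description.
hungarian_book = {"hu": "Magyar nyelvű leírás."}
```

An English-speaking visitor to the Hungarian entry still sees the Hungarian text, and a translation can be added later under its own code without touching the original.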
My eyes... They failed to see what you were commenting on. The horror...
A lot of the confusion here arises from the fact that this claims to be a "library". A library is where you can borrow books. An online library would be something where you can download books. On their site you can't even read books. It's a (bookstore/library/etext) catalog at most.
Here comes the W.I.A.A. The Writing Industry's Association of America.
Internet: Serious Business
Mozilla'y browsers have the option, View -> Page Style -> No Style. If you don't like the result, use the browser's Preferences.
I sure would appreciate web designers not defining color and font properties for main content though, so my preferences can shine through. But alas, designers prioritize form over function.
I was thinking about annotated texts and the need for richer encodings of the works (that could be 'rendered' as plain text PG style files). I had looked at hgw but it seemed like there was not any recent activity (a 2001 story as recent news on the news page), and it seemed they were using html as an encoding of presentation.
It would be great to host a cleaned up structural representation of the book (author, edition, publisher, chapters, para, etc) and then allow people to layer annotations over it via some kind of wiki capability. Then support tagging of sets of annotations so that one could render/generate/export a version appropriate for a lit scholar or a different version appropriate for a 'casual' reader or a reader interested in how events in the author's life found their way into the book (say). It would be nice to allow groups of annotators to 'own' their sets, both to avoid edit wars and to allow a variety of consistent note sets. And of course you'd want to be able to render/export/generate to a number of representations: html/pdf/ebook reader etc.
I could not find anything close to that capability. Does anyone have any pointers to something like that?
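Not a pointer to an existing tool, but the data model described above could be sketched roughly like this. All names (`Paragraph`, `Annotation`, `render_plain`, the note-set labels) are hypothetical, and only a plain-text renderer is shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paragraph:
    para_id: str
    text: str


@dataclass
class Annotation:
    para_id: str   # which paragraph the note attaches to
    note_set: str  # e.g. "scholarly" or "casual"; each set owned by one group
    text: str


def render_plain(paragraphs: List[Paragraph],
                 annotations: List[Annotation],
                 note_set: str) -> str:
    """Render the text with only the notes from one annotation set."""
    lines = []
    for para in paragraphs:
        lines.append(para.text)
        for note in annotations:
            if note.para_id == para.para_id and note.note_set == note_set:
                lines.append(f"  [{note.note_set}] {note.text}")
    return "\n".join(lines)


# Made-up sample content.
paragraphs = [
    Paragraph("p1", "First paragraph of the book."),
    Paragraph("p2", "Second paragraph of the book."),
]
annotations = [
    Annotation("p1", "scholarly", "Note for scholars."),
    Annotation("p1", "casual", "Note for casual readers."),
    Annotation("p2", "scholarly", "Another scholarly note."),
]
casual_edition = render_plain(paragraphs, annotations, "casual")
```

Because notes point at paragraph ids instead of living inside the text, each group can edit its own set without touching the base text or anyone else's notes, and other renderers (HTML, PDF) would only need to replace `render_plain`.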
My (dead tree) book library weighs well over 1,000 kg - more than 1 metric tonne. I have paid to transport it across and between 3 continents for 30 years - and yet I still rarely have it where I want it when I want it - and it's always a little more damaged when I get it. This has frustrated me for years because I travel a lot and read a lot. Now, I have an old IBM ThinkPad that does the job perfectly. It's a dedicated portable library. I have installed MS Reader plus ABC lit converter, + Adobe Professional and OpenOffice. I can make any ebook/text comfortable to read in about 5-10 minutes of editing time. If you convert any doc to a C5-size PDF, you have a reasonable paperback experience. MS Reader has brilliant type and a real book-sized viewing area with adjustable font size. While I agree that reading on a CRT is useless, a decent LCD is no problem. My ThinkPad is about the size and weight of many hardbacks. Type size and backlighting are adjustable, and it rests on a small pillow for reading in bed, on the sofa or in a hammock. It currently holds around 20,000 books plus a decent reference library of several hundred volumes. I have read on this device daily for several years and now actually prefer it to the experience of reading many of the older original volumes I own, especially those with brittle yellow pages and fixed type sizes. Forget about the negativity of those who will tell you it's not the same unless you can hold it and smell it and feel the texture of the cover. If you are a REAL reader and interested in books for their actual content, ebooks are fine. I promise you, those who take the plunge will never look back. The ebook is the future. Having said that, however, there is still absolutely no way I would buy a dedicated reader locked into any particular format.
Also, I will not have a DRM'd book on my hard drive and never register my copies of MS reader - if it doesn't work, or comes up with a copyright 'please jump through these hoops' request - it's recycle bin time!