Internet Book Database?


Posted by michael on Tuesday April 16, 2002 @10:14AM from the readin-ritin-and-right-clickin dept.

Anonymous Coward writes "Just about everyone has used either the CDDB or freedb CD databases. And many people are also familiar with DVD Profiler, a well-developed database for DVD fans. Each of these public databases has a number of wonderful strengths and a few weaknesses, but they are well thought out and well developed. After searching Google, SourceForge and every other search engine I could think of, I have come to the conclusion that there is no well-developed internet book database. While many people would be quick to point out the various commercial websites (Amazon, Barnes and Noble, etc.) and the various library databases (Library of Congress, Boston Public Library, and other online catalogs), none of these online databases offers the ease of use of DVD Profiler or the open structure of the online CD databases. The closest program I could find was the shareware program Readerware. This program will search several web sites and download the pertinent information, but it is extremely inefficient, as it does not then store the data in a central database to make it easier for other users, and in my opinion, the UI is terrible. What programs, if any, do those of you reading /. use to keep track of your books? If you were to start an open source internet book database project, what features would you include in it?" Books in Print is the definitive book database; apparently it costs about $30,000/year to license it.

To keep track of my books? by Schwamm · 2002-04-16 10:19 · Score: 2, Informative

I use a bookcase...

What would be the point of a book database? The databases for DVDs and CDs allow for players on a machine to spit out relevant track/title information. I'm having a hard time coming up with a reason to have a book database.
1. Re:To keep track of my books? by Venotar · 2002-04-16 10:50 · Score: 2, Informative
  
  ISBN-based may be inefficient. If your library has any real depth, you run into the same problem that prevents used bookstores from using a similar system: ISBNs were only widely implemented about 20 years ago (some margin for error there - I don't recall the exact date). Title/Author/Publisher is the only way to ID books predating ISBNs.
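One way to handle the mix of ISBN and pre-ISBN books is to key each record on the ISBN when there is one and on a normalized Title/Author/Publisher tuple otherwise. The following is only a rough sketch: the function names and the normalization rules (lower-casing, stripping punctuation) are assumptions for illustration, and real-world matching of old editions is much messier.

```python
import re

def _norm(text):
    """Lower-case, strip punctuation, and collapse runs of whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())

def book_key(isbn=None, title="", author="", publisher=""):
    """Return a lookup key: the ISBN if present, else title/author/publisher."""
    if isbn:
        return ("isbn", isbn.replace("-", "").replace(" ", "").upper())
    return ("tap", _norm(title), _norm(author), _norm(publisher))

# Two spellings of the same (made-up) pre-ISBN book map to the same key.
key_a = book_key(title="The Whale.", author="H. Melville", publisher="Harper")
key_b = book_key(title="the  whale", author="h melville", publisher="HARPER")
print(key_a == key_b)
```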
Have you tried the Dewey Decimal System? by Ummagumma · 2002-04-16 10:20 · Score: 3, Informative

I put all my books in order on my shelves, and make 3.5" index cards for each, organized by the Dewey Decimal System.

That way, when the power goes out, I can still find the right book by candlelight. ;)

--
"The natural progress of things is for liberty to yield and government to gain ground." - Thomas Jefferson
1. Re:Have you tried the Dewey Decimal System? by parc · 2002-04-16 11:19 · Score: 3, Informative
  
  I'm in the process of doing the same thing, but I hit on something nice: the computer's great at putting things in order. So I scan or type in the ISBN, a Perl script grabs the book's information from the LOC (via Z39.50), and when I'm done, the system spits out a list of books in LOC order with the Title/Author next to each. As I go down the list, each book naturally gets put in the right place on the shelf.
  
  At least that's the theory. I'm over 200 books so far, and I've just finished about 1/3 of the books.
  
  This is gonna take some time.
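The "list of books in LOC order" step comes down to sorting on the call number, where the class number has to compare numerically (QA9 shelves before QA76). Here is a minimal Python sketch of such a sort key. It assumes simple call numbers of the form letters, number, cutters; the function name and sample records are hypothetical, and real LC call numbers have more cases (dates, volume numbers) than this handles.

```python
import re

# Simplified pattern: class letters, class number, then the rest (cutters etc.).
_CALL_RE = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)\s*(.*)$")

def lc_sort_key(call_number):
    """Return a tuple that sorts simple LC call numbers in shelf order."""
    m = _CALL_RE.match(call_number.upper())
    if m is None:
        # Unparseable call numbers sort last, in plain string order.
        return (1, call_number, 0.0, "")
    letters, number, rest = m.groups()
    return (0, letters, float(number), rest.strip())

# Made-up records: (call number, title).
books = [
    ("QA76.73.P22 W35", "Book A"),
    ("PS3545.H16 C3", "Book B"),
    ("QA9.8 .H63", "Book C"),
]
shelf_order = sorted(books, key=lambda b: lc_sort_key(b[0]))
for call, title in shelf_order:
    print(call, title)
```

A plain string sort would put QA76 ahead of QA9, which is why the class number is converted to a float before comparing.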
singlefile by yum · 2002-04-16 10:25 · Score: 4, Informative

==>
1. Re:singlefile by gadfium · 2002-04-16 10:33 · Score: 2, Informative
  
  That's a pay service costing US$20 per year.
  
  From its website:
  
  Singlefile is an easy-to-use web-based service that helps you organize the books you own, the books you are reading, the books you've read and the books you want to read.
  
  You can use it to keep track of the books you've loaned to friends, or books you haven't bought/read yet, or to find out how many non-fiction paperbacks with 275 pages you own, etc. Singlefile is also great for keeping a record of your books for insurance purposes. And, in affiliation with Amazon.com, you can discover and buy new books you might enjoy based on the authors in your collection!
  
  I think a free service is what is wanted by the original poster!
bip ain't that great by brechin · 2002-04-16 10:29 · Score: 2, Informative

I used to work for the local (independent) college bookstore (Illini Union Bookstore), and we had access to Books in Print in both dead tree (very old) and web-based (shared a login with our university's library) formats. While the information was usually very good and very reliable, there were many problems.

Do you have any old books? BIP can be very unreliable when trying to find books published before 1980. Even still, BIP doesn't include information on all the different editions of a particular book, so your ISBN may not yield any results.

Speaking of no results, the search feature on BIP is incredibly unreliable. You can search for an ISBN, not find a book, then search for the title and come up with a book with the ISBN you just searched for. Try putting that ISBN back into the search box and it doesn't work! Sometimes you get what you want, sometimes you don't.

Beyond basic bibliographic information (title, author, illustrator if any, publisher info, etc.), the pricing and availability information (available for most books in BIP's database) is not as up-to-date as it is reported to be. Many times we ordered books and the publisher told us the books were priced very differently from what BIP told us. Good luck getting an accurate estimate of how much your book collection is worth!

In the end, a book database like CDDB's CD database or, even better, like IMDb's movie database, including reviews and ratings, would help people organize and maintain their private collections, and would help bookstore employees get their job done. If only the book database software our bookstore used had the ability to access an outside database like that!
A start by JasonMaggini · 2002-04-16 10:34 · Score: 3, Informative

This site has some stuff on using barcode scanners (including the ever-popular cuecat) to catalog books...
I personally would like to catalog my collection with a relatively decent amount of information, but who wants to sit there and type all that stuff in?
I agree that the trick would be to keep a database from going to the Dark Side like CDDB did...
Free Library Databases - and a protocol by outlier · 2002-04-16 10:38 · Score: 5, Informative

Most large university libraries have free (beer) databases that typically contain huge numbers of books (many that are not held by the library).

For example, see mirlyn.web.lib.umich.edu and sign in as a guest and you can do all sorts of searches.

These libraries typically use the Z39.50 standard to connect. Z39.50 is a pretty decent standard, and it is widely used, standardized, and allows you to connect to many many databases.

Sounds like this could be what you're looking for.
Re:What would be the point? by hoggoth · 2002-04-16 10:39 · Score: 5, Informative

Then just stick the ISBN numbers into MySQL, an Excel spreadsheet, or an Access database.
Then write a quick Perl script to scan through the records and, for any that don't have all the information filled in, go scrape it off Amazon's web site.
I've already written several Perl scripts that scrape data from Amazon. It's pretty simple.

(hint:

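use LWP::Simple;
my $isbn = shift;   # ISBN given on the command line
my $page = get("http://www.amazon.com/exec/obidos/ASIN/$isbn");
# Placeholder pattern: swap in a real regex with four capture groups.
my ($desc, $pgs, $price, $other) = $page =~ /use regex to find (desc) and (pgs) and (price) and (other) useful stuff/;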

)
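The "scan through the records and fill in whatever is missing" loop can be sketched separately from the page fetching. The Python below is only an illustration: the field names, the record layout (a list of dicts) and `fake_lookup` are all assumptions, and `fake_lookup` returns canned data where a real script would fetch and parse a page.

```python
REQUIRED_FIELDS = ("title", "author", "pages", "price")

def is_incomplete(record):
    """Return True if any required field is missing or empty."""
    return any(not record.get(field) for field in REQUIRED_FIELDS)

def fill_missing(records, lookup):
    """Fill empty fields in place from lookup(isbn); return how many records changed."""
    changed_count = 0
    for record in records:
        if not is_incomplete(record):
            continue
        found = lookup(record["isbn"]) or {}
        changed = False
        for field in REQUIRED_FIELDS:
            if not record.get(field) and found.get(field):
                record[field] = found[field]
                changed = True
        if changed:
            changed_count += 1
    return changed_count

def fake_lookup(isbn):
    # Stand-in for the fetch-and-parse step; returns canned, made-up data.
    canned = {"0000000001": {"title": "Example Title", "author": "A. N. Author",
                             "pages": 123, "price": "9.99"}}
    return canned.get(isbn)

records = [
    {"isbn": "0000000001", "title": "Example Title"},
    {"isbn": "0000000002", "title": "Complete", "author": "B. Writer",
     "pages": 10, "price": "1.00"},
]
changed = fill_missing(records, fake_lookup)
print(changed, records[0])
```

Passing the lookup in as a function keeps the loop testable without hitting anyone's web site.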

--
- For the complete works of Shakespeare: cat /dev/random (may take some time)
Here are some useful links... by sailracer6 · 2002-04-16 10:43 · Score: 5, Informative

The UPC Database

You can add entries here for ANYTHING with a standard UPC, so some books are in here. Very useful.

The Book-Scanning Project

This guy wrote some Python scripts to convert UPCs to ISBNs - it can be done - and then feed them into Amazon's search engine. Very interesting, and he's already done it, so he has some experience.
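For the common case where the barcode on the book is a "Bookland" EAN-13 starting with 978, the conversion is mechanical: drop the 978 prefix and the EAN check digit, then recompute the ISBN-10 check digit. The sketch below covers only that case and is not the linked project's code; true UPC codes on older paperbacks need a publisher lookup table, which is not attempted here. The function names are my own.

```python
def isbn10_check_digit(first_nine):
    """Compute the ISBN-10 check character for a 9-digit string."""
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

def ean13_to_isbn10(ean):
    """Convert a 978-prefixed EAN-13 barcode string to an ISBN-10."""
    digits = ean.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected 13 digits")
    if not digits.startswith("978"):
        raise ValueError("only 978-prefixed (Bookland) codes map to ISBN-10")
    core = digits[3:12]  # the nine ISBN digits, without either check digit
    return core + isbn10_check_digit(core)

print(ean13_to_isbn10("9780201563177"))  # prints 0201563177
```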
Re:Would be good for small libraries worldwide by vtrhps · 2002-04-16 10:54 · Score: 2, Informative

There is already a company that provides just such a service: Online Computer Library Center (OCLC), from which libraries can buy bibliographic records to load into their online catalogs (or print for their card catalog). OCLC recently purchased NetLibrary, a provider of e-books. NetLibrary was having financial difficulties, and OCLC jumped in to make sure all those libraries that "purchased" these e-books would still have access.
Another source of Books in Print is through Gale Group. Many local libraries are purchasing access to the Gale Group databases (Books in Print, InfoTrac, etc) for their users. For instance, Virginia residents can type in the bar code number from their library card to get access to these databases from home.
I work in a library, but I'm not a librarian.
feed this page an isbn, get XML out by Pinball+Wizard · 2002-04-16 10:59 · Score: 5, Informative

If you are looking for XML data, feel free to use this page (ASP at the moment, but it will soon be redone in Perl).

The important thing is that it outputs XML, so if you want to build an interface to it for your own application, you can. It's not a 100% complete database, but it should give you basic information on any book available.

I wrote this specifically for external search engines back when XML was the new hot thing. Funny thing is, the sites that search us usually want an FTP data feed, so this doesn't really get used much. But again, feel free (be reasonable if you use a bot - maybe limit your bot to a search every 5-10 seconds, please).
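For anyone writing a bot against a page like this, the "one search every 5-10 seconds" request is easy to honor with a small throttle that sleeps only as long as needed between calls. This is a sketch, not part of the service described above; the class name is made up, and the clock and sleep functions are injectable purely so the logic can be tested without waiting.

```python
import time

class Throttle:
    """Enforce a minimum delay between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A bot would create one `Throttle(5)` and call `wait()` immediately before each request.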

--
No, Thursday's out. How about never - is never good for you?
Use Z39.50 by Anonymous Coward · 2002-04-16 11:04 · Score: 1, Informative

Many large library databases are searchable with a protocol called Z39.50. There is a Perl module implementing this protocol (among many others). Check out http://perl.z3950.org/ for full docs. The reason you get back complex stuff when you do a search should be obvious if you ever read the cataloging information about a book in a library catalog. There's a lot of stuff there. If you're using this for making a catalog of your private library, do a "known item search", for example using ISBN.
Books in Print doesn't cost $30,000 by Anonymous Coward · 2002-04-16 11:20 · Score: 1, Informative

I was suspicious of this price because every little bookstore I go into seems to have online access to BIP, so I checked on the Books in Print web site. They have a sliding rate, even a free trial. Unfortunately I couldn't get price details because their web site crashed my Netscape browser.
isbn.nu is useful by Anonymous Coward · 2002-04-16 11:47 · Score: 1, Informative

The nice folks at isbn.nu have a database you must check out. Try http://www.isbn.nu/0201563177 for example.
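Before sending a typed-in ISBN to any lookup site, it is worth checking the ISBN-10 checksum locally to catch typos: the ten characters, weighted 10 down to 1, must sum to a multiple of 11 (a final "X" counts as 10). The sketch below does that and builds a URL in the form shown above; the function names are made up, and nothing here is an official isbn.nu interface.

```python
def is_valid_isbn10(isbn):
    """Check the ISBN-10 weighted checksum (weights 10..1, sum divisible by 11)."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10 or not chars[:9].isdigit():
        return False
    if not (chars[9].isdigit() or chars[9] == "X"):
        return False
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] == "X" else int(chars[9]))
    total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
    return total % 11 == 0

def isbn_nu_url(isbn):
    """Build a lookup URL in the form used in the post, rejecting bad ISBNs."""
    if not is_valid_isbn10(isbn):
        raise ValueError("not a valid ISBN-10: %r" % isbn)
    return "http://www.isbn.nu/" + isbn.replace("-", "").replace(" ", "").upper()

print(isbn_nu_url("0-201-56317-7"))  # prints http://www.isbn.nu/0201563177
```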
Re:Would be good for small libraries worldwide by makohund · 2002-04-16 12:21 · Score: 2, Informative

As a matter of fact... that is actually not that far off from how OCLC works.

From the site...

At the center of OCLC services is the WorldCat database, which:

* Is the most consulted database in higher education
* Holds over 47 million cataloging records created by libraries around the world, with a new record added every 15 seconds
* Spans over 4,000 years of recorded knowledge with 400 languages represented
* Includes 840,637,829 location listings

I'm not a librarian (I'm the sysadmin... the technical services librarian just left for the day or I'd just ask her) but I work in one and I believe the records are all submitted by member libraries.

Anyway, go to the site for more info. I gotta get back to work. :)

http://www.oclc.com/about/
Actually, there's work being done on one... by ExplodingTeakettle · 2002-04-16 12:28 · Score: 4, Informative

What you're basically proposing is a way to share bibliographic metadata -- not the book itself, but table of contents information, library holdings, etc. There are standards amongst libraries for doing this (ANSI/NISO Z39.50 and AACR2--both of which are horribly abstruse and generally a pain to deal with). Dr. Rob Cameron, along with a small group of Simon Fraser University students, has been working on the seeds of a system for sharing bibliographic metadata -- see http://www.usin.org. This basically extends the URI standard to support ISBNs and ISSNs, initially to support scholarly communication, but also making it possible to create what we call "personal bibhosts" with support for annotations, shared notes, etc. Among other things, we've implemented searches across various worldwide libraries to obtain and compare bits of bibliographic info, and so forth. Yes, you still run into the problems of inconsistent data for a given ISBN/ISSN (as a previous poster pointed out), but hey...you have to start somewhere!
I have one in development right now... by ChrisKnight · 2002-04-16 15:04 · Score: 3, Informative

I am currently building a database of ISBN numbers with the following records: Title, Author, Publisher and Media.

It hadn't really occurred to me that others might like access to this kind of data as well.

Seriously, is there enough interest that it might be worth the effort to add a request interface that returned an XML object of the data that I have? Would others contribute to it?

I currently have 294,652 completed entries in my database. I'm out of work and bored, and I'll make it publicly accessible if I get some feedback indicating that it would be worth the effort.
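To give a feel for what "an XML object of the data" could look like for the four fields listed above, here is a small Python sketch using the standard library. The element and attribute names are invented for illustration; the actual database's schema is not described in this post.

```python
import xml.etree.ElementTree as ET

def book_to_xml(isbn, title, author, publisher, media):
    """Serialise one record as a small XML document (element names are made up)."""
    book = ET.Element("book", {"isbn": isbn})
    for tag, value in (("title", title), ("author", author),
                       ("publisher", publisher), ("media", media)):
        ET.SubElement(book, tag).text = value
    return ET.tostring(book, encoding="unicode")

# Made-up sample record.
xml_text = book_to_xml("0000000000", "Example Title", "A. N. Author",
                       "Example Press", "paperback")
print(xml_text)
```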

-Chris

--
-- This sig is only a test. If this were a real sig it would say something witty. --
Re:The Example of CDDB by Anonymous Coward · 2002-04-17 04:59 · Score: 1, Informative

I am also working on an open source project for a library catalog. I'm still in the investigation stages, although I've been mapping out the database in some early prototyping. This time I'm not taking it lightly.

As an example, our collection is over 10,000 books, way too many to catalog by hand, although I've tried several different times to start and keep a catalog going. It is a daunting task.

I know I'm going way overboard for what most people's needs are, so after having reviewed many of the book databases that are out there I came to the conclusion I needed to try and write my own. AGAIN...sigh.

As a former librarian I feel the need to let people know that a good library database is a catalog of all types of items: books, music, movies, and all other types of media (print and non-print). I've even run into a few that catalog objects, especially works of art and musical instruments.

There's been a lot of discussion of LOC and some mention of the MARC record format. One should also consider the OCLC system, which many colleges and universities belong to. From working with all kinds of funded catalogs I can tell you that it will be very difficult, if not next to impossible, to get one together that is "truly" comprehensive. Books have been around since long before Gutenberg's press, and between various printings, editions, translations and the like, the outpouring of the printed word far outstrips the capacity of man.

But back to topic...sorry. My database is going to be for individual or private use. However, I am planning to set it up to go out and collect data to be brought back and stored for processing. As an example...go out to LOC and download the MARC records for a particular author (a batch process will be much easier than trying to do it by ISBN in our case), translate those MARC records to XML records and then dump them into a database that is much more useful than a flat file. That's the theory anyway. I haven't really started coding yet since I am determined to do it right and hopefully for the last time - this time. (yeah right - says the little voice in the back of my head)

Since this will produce more books than I own, I would then want to be able to check off the books I own and make lists of the ones I'm missing in my sets (oh, did I mention that I'm a fanatic?) so I can go shopping. Ideally something that can be output to a simple format for shopping, not the whole ball of wax, onto a handheld. I love technology. I also want to keep track of those books I've read from borrowing (public library or friends), and whether I want to own them or not.
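The "check off what I own, list what I'm missing" step is essentially a set difference that keeps the original order of the set. A tiny Python sketch, with a hypothetical function name and made-up titles, matching titles case-insensitively:

```python
def shopping_list(series_titles, owned_titles):
    """Return titles from the full list that are not owned, keeping list order."""
    owned = {t.strip().lower() for t in owned_titles}
    return [t for t in series_titles if t.strip().lower() not in owned]

series = ["Book One", "Book Two", "Book Three"]
owned = ["book two"]
to_buy = shopping_list(series, owned)
# One title per line is about as simple a hand-held format as it gets.
print("\n".join(to_buy))
```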

The interface and the scripts I'll need to do the job right are going to have to be able to search several kinds of databases, from LOC to local and school libraries and, oh yeah, Amazon and B&N or Borders, plus the ability to add new scripting as desired when good sites are found. No one of these is going to have it all, since my own collection includes books both pre- and post-ISBN. (BTW - be careful, they do recycle ISBN and LC call numbers.)

I will probably be including all the fields from the MARC record format and then store that as a flat file record and pull the more mundane parts out to a relational set up.
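The "keep the whole record flat, pull the mundane parts out to relational columns" layout can be sketched with SQLite. Everything below is an assumption for illustration: the table layout is made up, and each record is simplified to a dict keyed by MARC tag (020 for ISBN, 100 for main author, 245 for title, 260 for publication info), whereas real MARC records also carry indicators and subfields.

```python
import json
import sqlite3

def store_record(conn, fields):
    """Store the full record as JSON text and copy common fields to columns.

    `fields` is a simplified stand-in for a MARC record: a dict mapping
    tag strings to values.
    """
    conn.execute(
        "INSERT INTO books (isbn, title, author, raw) VALUES (?, ?, ?, ?)",
        (fields.get("020"), fields.get("245"), fields.get("100"),
         json.dumps(fields, sort_keys=True)),
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, isbn TEXT, title TEXT, "
    "author TEXT, raw TEXT)"
)
# Made-up sample record.
store_record(conn, {"020": "0000000000", "245": "Example Title",
                    "100": "Author, A. N.", "260": "Example Press"})
row = conn.execute("SELECT isbn, title, author, raw FROM books").fetchone()
print(row[1], json.loads(row[3])["260"])
```

The searchable columns cover everyday queries, while the `raw` column keeps every field so nothing is lost if the relational part is redesigned later.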

This is all going to take quite a bit of time, so I don't expect to be done anytime soon. I want to structure it all in such a way that as technology changes I can plug in different parts that have better solutions too. GAD - what have I started? (whimper).

I will, however, keep an eye out for what others have done in the meantime. Who knows, maybe someone will make mine obsolete before I finish. That would be cool too.

Yikes! this has gotten way too long - I think I'd better stop now.