Internet Book Database?
Anonymous Coward writes "Just about everyone has used either the CDDB or freedb CD databases. And many people are also familiar with DVD Profiler, a well developed database for DVD fans. Each of these public databases has a number of wonderful strengths, and a few weaknesses, but they are well thought out and well developed. After searching Google, SourceForge and every other search engine I could think of, I have come to the conclusion that there is not a well developed internet book database. While many people would be quick to point out the various commercial websites (Amazon, Barnes and Noble, etc.), and the various library databases (Library of Congress, Boston Public Library, and other online catalogs), none of these online databases offer the same ease of use as DVD Profiler, or the open structure of the online CD databases. The closest program I could find was the shareware program Readerware. This program will search several web sites and download the pertinent information, but it is extremely inefficient, as it does not then store the data in a central database to make it easier for other users, and in my opinion, the UI is terrible. What programs, if any, do those of you reading /. use to keep track of your books? If you were to start an open source internet book database project, what features would you include in it?"
Books in Print is the definitive book database; apparently it costs about $30,000/year to license it.
So while I really like the idea of the database, I do not like the possibility of the thievery of honest work by generous people.
Is there some way that this could be donated into the public domain or something from day one?
(just trying to wrap my mushy mind around this for the moment.)
"It is a greater offense to steal men's labor, than their clothes"
In conjunction with a barcode scanner/CueCat, it could make it a lot easier to start private libraries. I have a couple thousand books, about a hundred of which I lend out at any given time. Be nice to make a catalog, and a freely available cddb-like ISBN-based system would make that a lot easier.
==>
My use for such a database is partly to make sure I haven't already read something (I just love buying the same book twice because the cover changed), and partly for insurance reasons.
I want to be able to use a barcode scanner (or even type the ISBN by hand), and pull all the relevant information from a DB to my local machine. This is exactly the point of CDDB, as I see it.
If I don't have to enter all the information by hand for a CD, why should I have to do it for a book?
--jcwren (owner of about 2700 books)
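If ISBNs are going to be typed by hand, it probably helps to check them before hitting any database. Here is a minimal Python sketch of the ISBN-10 check-digit test; the function names are made up for illustration and are not from any existing tool.

```python
DIGITS = set("0123456789")


def normalize_isbn10(raw: str) -> str:
    """Drop hyphens and spaces, and upper-case a trailing 'x' check character."""
    return raw.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(raw: str) -> bool:
    """Return True if raw is a well-formed ISBN-10 with a correct check digit."""
    isbn = normalize_isbn10(raw)
    if len(isbn) != 10:
        return False
    # The first nine characters must be digits; the last may also be 'X' (value 10).
    if not all(ch in DIGITS for ch in isbn[:9]):
        return False
    if isbn[9] not in DIGITS and isbn[9] != "X":
        return False
    # Weights run 10, 9, ..., 1 from left to right; the sum must be divisible by 11.
    total = 0
    for position, ch in enumerate(isbn):
        value = 10 if ch == "X" else int(ch)
        total += (10 - position) * value
    return total % 11 == 0
```

A mistyped digit fails the check, so a front end could reject the entry before doing a lookup.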
What if you could scan it, and it brings a copy of that record into your local database, prints out a book plate, and then sets up a borrowing schedule for it? That'd be cool. And very helpful for small libraries.
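For what it's worth, the local-database half of that idea is small. Below is a rough sketch using Python's built-in sqlite3 module. The table layout, the function names, and the 21-day default loan period are all assumptions for illustration, and the book plate printing is left out.

```python
import sqlite3
from datetime import date, timedelta


def open_catalog(path=":memory:"):
    """Open (or create) a local catalog with a books table and a loans table."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS books (
            isbn   TEXT PRIMARY KEY,
            title  TEXT NOT NULL,
            author TEXT NOT NULL);
        CREATE TABLE IF NOT EXISTS loans (
            isbn     TEXT NOT NULL REFERENCES books(isbn),
            borrower TEXT NOT NULL,
            due_on   TEXT NOT NULL,
            returned INTEGER NOT NULL DEFAULT 0);
    """)
    return conn


def add_book(conn, isbn, title, author):
    """Store a record, e.g. one just fetched from a shared database."""
    conn.execute("INSERT OR REPLACE INTO books VALUES (?, ?, ?)",
                 (isbn, title, author))
    conn.commit()


def lend(conn, isbn, borrower, today, days=21):
    """Record a loan and return its due date; the book must be cataloged first."""
    found = conn.execute("SELECT 1 FROM books WHERE isbn = ?", (isbn,)).fetchone()
    if found is None:
        raise KeyError(isbn)
    due = today + timedelta(days=days)
    conn.execute("INSERT INTO loans (isbn, borrower, due_on) VALUES (?, ?, ?)",
                 (isbn, borrower, due.isoformat()))
    conn.commit()
    return due


def overdue(conn, today):
    """List (title, borrower, due_on) for unreturned loans due before today."""
    rows = conn.execute(
        "SELECT b.title, l.borrower, l.due_on "
        "FROM loans l JOIN books b ON b.isbn = l.isbn "
        "WHERE l.returned = 0 AND l.due_on < ? ORDER BY l.due_on",
        (today.isoformat(),))
    return rows.fetchall()
```

Dates are stored as ISO strings so that plain string comparison in SQL sorts them correctly.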
Some people ask `what is the point?'.
My answer to that is the following: It would be nice to be able to look up info about a book, given a small amount of information. Suppose you are a library and you want to catalogue books. Instead of having to type in all the information yourself, you could just type in the ISBN and all the information gets downloaded to the local catalogue.
I have had to make a database and enter data for a library, and that would have made life a lot easier!
Most large university libraries have free (beer) databases that typically contain huge numbers of books (many that are not held by the library).
For example, see mirlyn.web.lib.umich.edu and sign in as a guest and you can do all sorts of searches.
These libraries typically use the Z39.50 standard to connect. Z39.50 is a pretty decent standard, and it is widely used, standardized, and allows you to connect to many many databases.
Sounds like this could be what you're looking for.
Then just stick the ISBN numbers into MySQL, an Excel spreadsheet, or an Access database.
Then write a quicky Perl script to scan through the records and, for any that don't have all the information filled in, go scrub it off of Amazon's web site.
I've already written several Perl scripts that scrub data from Amazon. It's pretty simple.
(hint:
use LWP::Simple;
$page = get("http://www.amazon.com/exec/obidos/ASIN/$isbn");
($desc, $pgs, $price, $other) = $page =~ /use regex to find (desc) and (pgs) and (price) and (other) useful stuff/;
)
- For the complete works of Shakespeare: cat
You can add entries here for ANYTHING with a standard UPC, so some books are in here. Very useful.
The Book-Scanning Project
This guy wrote some Python scripts to convert UPC's to ISBN's - it can be done - and then feed them into Amazon's search engine. Very interesting, and he's already done it, so he has some experience.
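For the common case the conversion is just arithmetic. This is not that project's code, only a sketch under one big assumption: the barcode is a 13-digit "Bookland" EAN starting with 978. A true 12-digit UPC, as on some mass-market paperbacks, needs a publisher lookup table that is not shown here. The function name is hypothetical.

```python
def ean13_to_isbn10(ean: str) -> str:
    """Convert a 978-prefixed EAN-13 barcode number to the matching ISBN-10."""
    digits = ean.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not all(ch in "0123456789" for ch in digits):
        raise ValueError("expected exactly 13 digits")
    if not digits.startswith("978"):
        raise ValueError("only 978-prefixed EANs map to an ISBN-10")
    # EAN-13 check: weights alternate 1, 3 over the first twelve digits.
    weighted = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    if (10 - weighted % 10) % 10 != int(digits[12]):
        raise ValueError("bad EAN-13 check digit")
    # Drop the 978 prefix and the EAN check digit, then recompute the
    # ISBN-10 check character (weights 10 down to 2, modulo 11).
    core = digits[3:12]
    total = sum((10 - i) * int(d) for i, d in enumerate(core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))
```

The resulting ISBN can then be fed into whatever search or catalog you use.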
The important thing is it outputs XML, so if you want to build an interface to it for your own application, you can. It's not a 100% complete database, but it should give you basic information on any book available.
I wrote this specifically for external search engines back when XML was the new hot thing. Funny thing is, the sites that search us usually want an FTP data feed, so this doesn't really get used much. But again, feel free (be reasonable if you use a bot - maybe limit your bot to a search every 5-10 seconds, please).
No, Thursday's out. How about never - is never good for you?
http://www.gutenberg.org
is the official url IIRC
absolutely wonderful resource. they have a ton of books and the transcriptions are of pretty high quality--they have an excellent QA process.
int func(int a);
func((b += 3, b));
So you also fail to see the usefulness of the Internet Movie Database? I, personally, visit the IMDb almost as much as I visit /.
What you're basically proposing is a way to share bibliographic metadata -- not the book itself, but table of contents information, library holdings, etc. There are standards amongst libraries for doing this (ANSI/NISO Z39.50 and AACR2--both of which are horribly abstruse and generally a pain to deal with). Dr. Rob Cameron, along with a small group of Simon Fraser University students, has been working on the seeds of a system for sharing bibliographic metadata -- see http://www.usin.org. This basically extends the URI standard to support ISBNs and ISSNs, initially to support scholarly communication, but also making it possible to create what we call "personal bibhosts" with support for annotations, shared notes, etc. Among other things, we've implemented searches across various worldwide libraries to obtain and compare bits of bibliographic info, and so forth. Yes, you still run into the problems of inconsistent data for a given ISBN/ISSN (as a previous poster pointed out), but hey...you have to start somewhere!
In the same way that people research CDDB and IMDB to see who has recorded what and who has directed what and who starred in what, people also want to see full author bibliographies, checklists for book series, edition lists (there are people who collect multiple editions and sometimes even printings of the same book). Then the hunt begins. There are more uses of catalogs and libraries than just adding metadata to your own collection. There's also research.
Take a look at the SFDB for an example.
Carnegie Mellon University has been working on the ulib project for a number of years now.
This is also a shameless plug for one of my IRC friends responsible for this. Hi Latinum.