Internet Book Database?


Posted by michael on Tuesday April 16, 2002 @10:14AM from the readin-ritin-and-right-clickin dept.

Anonymous Coward writes "Just about everyone has used either the CDDB or freedb CD databases. And many people are also familiar with DVD Profiler, a well-developed database for DVD fans. Each of these public databases has a number of wonderful strengths and a few weaknesses, but they are well thought out and well developed. After searching Google, SourceForge and every other search engine I could think of, I have come to the conclusion that there is not a well-developed internet book database. While many people would be quick to point out the various commercial websites (Amazon, Barnes and Noble, etc.) and the various library databases (Library of Congress, Boston Public Library, and other online catalogs), none of these online databases offers the ease of use of DVD Profiler or the open structure of the online CD databases. The closest program I could find was the shareware program Readerware. This program will search several web sites and download the pertinent information, but it is extremely inefficient, as it does not then store the data in a central database to make it easier for other users, and in my opinion, the UI is terrible. What programs, if any, do those of you reading /. use to keep track of your books? If you were to start an open source internet book database project, what features would you include in it?" Books in Print is the definitive book database; apparently it costs about $30,000/year to license it.

21 of 231 comments

Wrote my own by smoore · 2002-04-16 10:21 · Score: 2, Interesting

Wrote my own with MySQL/PHP. It's not very good, just enough to keep track of them. http://www.teuse.net/books

--
Shawn Moore http://www.teuse.net
1. Re:Wrote my own by ChazeFroy · 2002-04-16 13:42 · Score: 4, Interesting
  
  Carnegie Mellon University has been working on the ulib project for a number of years now.
  
  This is also a shameless plug for one of my IRC friends responsible for this. Hi Latinum.
What would be the point? by tongue · 2002-04-16 10:21 · Score: 2, Interesting

I fail to see the usefulness of such a database, outside the traditional search engine uses. CDDB and freedb both serve a function in that they identify some electronic data for me so I don't have to--a CD I've inserted into my drive. DVD Profiler presumably performs a similar function (I've not used it so I can't say for certain). But books don't have an analogue in this area. If you had an electronic version of a book, presumably it would also have whatever index you needed with it. And if you wanted an index across titles, you would use some search engine like Google. But there aren't enough of these kinds of titles to warrant such an application, and I'm afraid I don't see the advent of that time approaching. Between incompatible proprietary formats and the DMCA, I think it'll be quite a long time before we have a standard "book cd" format that is used in generic book appliances à la Rocketbook.
The Example of CDDB by Alien54 · 2002-04-16 10:22 · Score: 5, Interesting

I am concerned about the prior example of the CDDB, where all of these people contributed to this great resource, only to have the resource get sold off and commercialized and turned into a tool to track users, etc.
So while I really like the idea of the database, I do not like the possibility of the thievery of honest work by generous people.
Is there some way that this could be donated into the public domain or something from day one?
(just trying to wrap my mushy mind around this for the moment.)

--
"It is a greater offense to steal men's labor, than their clothes"
1. Re:The Example of CDDB by ewhac · 2002-04-16 11:48 · Score: 3, Interesting
  
  Form a 501(c)(3) non-profit corporation that owns and operates the database. Draw up the corporate charter such that the database must be maintained for the sole benefit of the community, that users' activity will never be tracked, etc.
  My (limited) understanding is that the law makes 501(c)(3) charters very hard to change. As such, new management can't just waltz in and "sell out" the company and its resources.
  The only remaining danger is that the organization becomes politically influential and either leverages that influence to the detriment of the community, or itself comes under the influence of corrupt organizations.
  Schwab
  
  --
  Editor, A1-AAA AmeriCaptions
Cue::Cat by Roadmaster · 2002-04-16 10:25 · Score: 2, Interesting

Yeah, and let's enable the database so that you can point your cue::cat at the book's barcode and up pops the relevant page with information about which book you're reading.

Ain't it easier to just look at the cover??
Re:Have you tried the Dewey Decimal System? by Schwamm · 2002-04-16 10:26 · Score: 2, Interesting

I've actually arranged my books based on the Library of Congress call numbers. It worked rather nicely. Of course, it was a pain in the butt to do it at first. For some reason, you only ever come up with an idea for organizing something *after* you have too many to deal with quickly.
Readerware by chennes · 2002-04-16 10:27 · Score: 2, Interesting

I use Readerware, and while I grant to you that it is "inefficient" in some sense (and yes, the interface sucks), the folks working on it are continuously updating the thing, and its ability to search about 2 dozen different sources for book information is really wonderful. Since most people don't play books by putting them in a slot in their computer, there isn't really that much demand for a really high-power archiver. I personally just scan my new books in and click "update" - Readerware finds everything I need, no problem, and I don't have to do it that often. Chris
1. Re:Readerware by Anonymous Coward · 2002-04-17 05:28 · Score: 1, Interesting
  
  I love the fact there is a Palm version. I always take my PDA when I'm book shopping to prevent buying duplicates.
What's the Purpose? by rubinson · 2002-04-16 10:34 · Score: 3, Interesting

What programs, if any, do those of you reading /. use to keep track of your books? If you were to start an open source internet book database project, what features would you include in it?

What purpose would such a database serve? CDDB/freedb, for example, allow us to download the album titles automatically. Saves everyone a lot of tedious work. Obviously, you're not going to be doing this for books.

As a graduate student, I maintain a single text file of all articles and texts that I've ever referenced. Each entry has a unique identifier (UID), which I use in my own articles instead of typing the full reference. A shell script then updates the references and BibTeX automatically generates the bibliography.

I could see where it could be useful to have a centralized resource that could automatically download those references - but only if it was quicker/easier than typing it in myself (and that only takes a couple of seconds).

What other purposes would such a database serve? How would it make my life easier?
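
A minimal sketch of the UID-keyed reference store described above, turning short identifiers into BibTeX entries. The poster used a text file and a shell script; Python is swapped in here, and the `REFERENCES` layout, the `knuth84` key, and the `to_bibtex` helper are illustrative assumptions rather than the poster's actual setup.

```python
# Hypothetical reference store: short UID -> bibliographic fields.
REFERENCES = {
    "knuth84": {
        "type": "book",
        "author": "Donald E. Knuth",
        "title": "The TeXbook",
        "publisher": "Addison-Wesley",
        "year": "1984",
    },
}


def to_bibtex(uid, fields):
    """Render one record as a BibTeX entry keyed by its UID."""
    body = ",\n".join(
        f"  {key} = {{{value}}}"
        for key, value in fields.items()
        if key != "type"  # the entry type goes in the header, not the body
    )
    return f"@{fields['type']}{{{uid},\n{body}\n}}"


def write_bib(references):
    """Build the text of a .bib file from every record in the store."""
    return "\n\n".join(to_bibtex(uid, f) for uid, f in sorted(references.items()))


if __name__ == "__main__":
    print(write_bib(REFERENCES))
```

An article would then cite `knuth84` and let BibTeX expand it, which is roughly the "couple of seconds" baseline a shared database would have to beat.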
Like this? by danro · 2002-04-16 10:34 · Score: 3, Interesting

Is there some way that this could be donated into the public domain or something from day one?

Maybe by making the source available under the GPL, and making the ability for different instances of the database to exchange information with each other be a part of the project?
That way anyone with a T1 and a fairly large disc could have his own bookDb.

That way, no single entity would be in exclusive control of the data.

On the other hand, no two databases would be exactly the same.
Hmm...
Database design is not my field really, maybe I should shut up, and just write a few frontends to the db once someone has dreamt one up...

--

"First lesson," Jon said. "Stick them with the pointy end."
1. Re:Like this? by danro · 2002-04-16 10:57 · Score: 2, Interesting
  
  Sure, but the problem is making sure the data is consistent.
  Just because the ISBN is in both the querying database and the databases it uses as a reference doesn't mean the entries contain the same data.
  And if they don't, how should the db know which record is the more correct?
  Not a trivial problem to solve: you can't have the databases trust each other too much, since you don't want some lame script kiddie getting pleasure from injecting lots of false data and watching it spread...
  
  But, like I said, this is not my field of expertise, I'm sure there are a lot of people on slashdot that know a lot more about the subject...
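
One hedged way to picture the consistency problem: when two instances hold a record for the same ISBN, fill in missing fields from the peer but report disagreements instead of overwriting anything. This is only a sketch; the dictionary record layout and the `diff_records` / `merge_records` names are assumptions made for illustration, and it does nothing about deciding which source to trust.

```python
def diff_records(local, remote):
    """Return {field: (local_value, remote_value)} for fields that disagree."""
    conflicts = {}
    for field in sorted(set(local) | set(remote)):
        a, b = local.get(field), remote.get(field)
        # A field missing on one side is a gap, not a conflict.
        if a is not None and b is not None and a != b:
            conflicts[field] = (a, b)
    return conflicts


def merge_records(local, remote):
    """Fill gaps from the remote copy; local values are never overwritten.

    Returns the merged record plus the conflicts left for a human
    (or some trust policy) to resolve.
    """
    merged = dict(remote)
    merged.update(local)  # local values win wherever both sides have one
    return merged, diff_records(local, remote)


if __name__ == "__main__":
    # Placeholder records; the titles are made up for the example.
    mine = {"isbn": "0306406152", "title": "Example Title"}
    theirs = {"isbn": "0306406152", "title": "Example title", "pages": 320}
    record, conflicts = merge_records(mine, theirs)
    print(record)
    print(conflicts)
```

Keeping conflicts separate from the merged record means injected bad data cannot silently replace an existing entry, though it can still fill an empty field.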
  
  --
  
  "First lesson," Jon said. "Stick them with the pointy end."
Re:Would be good for small libraries worldwide by Anonymous Coward · 2002-04-16 10:55 · Score: 1, Interesting

Small libraries and the like can access OCLC. OCLC provides a definitive copy of the book's record. Can you imagine what would happen if someone tried to enter their own data? Not only do books which have the same title have different ISBNs, but the data being entered would be subject to the interpretation of the person entering it (e.g. St. vs Saint).

There are rules that need to be followed in order to maintain any sort of consistency in record keeping. Remember, a library isn't kept at all like your bookshelf.
Re:To keep track of my books? by Lemmy+Caution · 2002-04-16 10:56 · Score: 3, Interesting

Fair enough, the key value would be T/A/P/E (Edition information) + ISBN - but a system that returned T/A/P for ISBN when it's there and vice versa (or null when it isn't) would still be very, very helpful. Reducing data input by the percentage of books in a library that are 20 years old is still a gain.
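
If the ISBN is going to be part of the key, it helps to reject mistyped ones before any lookup. A small sketch of the standard ISBN-10 check (weights 10 down to 1, sum divisible by 11, with `X` standing for 10 in the last position); the function name is an assumption, and the sample number is the commonly used textbook example rather than anything from this thread.

```python
def is_valid_isbn10(isbn):
    """Check the ISBN-10 checksum, ignoring hyphens and spaces."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' is only legal as the final check character
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


if __name__ == "__main__":
    print(is_valid_isbn10("0-306-40615-2"))  # checksum holds
    print(is_valid_isbn10("0-306-40615-3"))  # last digit mistyped
```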
Bookcrossing.com by chris_mahan · 2002-04-16 10:57 · Score: 2, Interesting

Check out bookcrossing.com. You can have your own bookshelf. Just type the ISBN, it retrieves the cover art, the author and all that. You can fix it too.

I use it, I like it.

--
"Piter, too, is dead."
Re:For items out of copywrite... by lkaos · 2002-04-16 11:02 · Score: 5, Interesting

http://www.gutenberg.org

is the official url IIRC

absolutely wonderful resource. they have a ton of books and the transcriptions are of pretty high quality--they have an excellent qa process.

--
int func(int a);
func((b += 3, b));
Writing my own by Thekim · 2002-04-16 11:04 · Score: 2, Interesting

I am writing my own catalog with MySQL/Perl for several reasons.

1) I don't have enough space in my tiny room to fit all my books into bookcases, but with the db I can put some books in boxes in the closet and easily find out in which box a certain book is.
2) I want my books sorted according to a standard classification system but still be able to have them in my own way in the bookcase. Currently I use a heavily outdated (1987) Swedish classification system that the kind folks at my school library lent me. So I'll definitely take a look at the Dewey Decimal system mentioned earlier.
3) I have books in several languages and with a db I can have the same kind of information on different books in different languages in the same place. Thus I don't have to look up the romanization for the Kanji (Chinese characters in Japanese) more than once. But of course it will store the original Kanji-titles as well.
4) I can easily create lists of books that I want to buy, that friends have borrowed from me, or that I have borrowed.

When it's finished I want it to handle 2-byte languages in a nice way, be compliant with existing standards for book classification, both Swedish and international, allow for easy list creation and have a nice interface.
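
A rough sketch of how the four requirements above might map onto a single table: a location column for the boxes, a classification code, original and romanized titles side by side, and a status for wanted/lent/borrowed lists. The poster's catalog is MySQL/Perl; Python's built-in `sqlite3` is swapped in here, and every table, column, and function name is an assumption.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS book (
    id              INTEGER PRIMARY KEY,
    title_original  TEXT NOT NULL,          -- stored as Unicode text
    title_romanized TEXT,                   -- looked up once, then kept
    language        TEXT,
    classification  TEXT,                   -- code from whichever scheme is used
    location        TEXT,                   -- e.g. a shelf or a box in the closet
    status          TEXT NOT NULL DEFAULT 'owned'  -- owned / wanted / lent / borrowed
);
"""


def open_catalog(path=":memory:"):
    """Open (or create) the catalog database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_book(conn, title_original, title_romanized=None, language=None,
             classification=None, location=None, status="owned"):
    """Insert one book and return its row id."""
    cur = conn.execute(
        "INSERT INTO book (title_original, title_romanized, language,"
        " classification, location, status) VALUES (?, ?, ?, ?, ?, ?)",
        (title_original, title_romanized, language, classification,
         location, status),
    )
    conn.commit()
    return cur.lastrowid


def books_at(conn, location):
    """List (original title, romanized title) for everything in one place."""
    return conn.execute(
        "SELECT title_original, title_romanized FROM book"
        " WHERE location = ? ORDER BY id", (location,)).fetchall()


def books_with_status(conn, status):
    """List (original title, romanized title) for a wanted/lent/borrowed list."""
    return conn.execute(
        "SELECT title_original, title_romanized FROM book"
        " WHERE status = ? ORDER BY id", (status,)).fetchall()
```

For example, `add_book(conn, "雪国", title_romanized="Yukiguni", language="ja", location="closet box 3")` followed by `books_at(conn, "closet box 3")` answers the "which box is it in" question from point 1.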
Project Gutenberg by jumex · 2002-04-16 11:05 · Score: 2, Interesting

Have you ever heard of Project Gutenberg? It is basically doing what you are talking about and has been since the 1970s. They have a pretty good collection, and I would totally suggest that anyone interested in an internet book DB help them out with their cause. Although I see your point that a full index of all books (without content) would be a pretty cool thing to have.

--
"Your 'Gin n'tonic Futon Brain' sure makes you smart!"
"That's 'Positronic-photon Brain', you idiot!"
Another use for a Cue Cat by ScottBob · 2002-04-16 13:22 · Score: 3, Interesting

So I scan or type in the ISBN, a perl script grabs the book's information from the LOC (via Z39.50), and when I'm done, the system spits out a list of books in LOC order with the Title/Author next to it.

And what better to scan the ISBN with than a Cue Cat. My mother has about 400 paperback romance novels, and every time she goes to the bookstore, she can't figure out if she's read that book yet or not. She picks a book up, reads two pages, and says "I can't tell if I've read that one before or not." (Of course, I ask her how can she tell?) A Cue Cat and a CDDB style book database would allow me to scan the barcode and catalog every one of her books very quickly so she can bring a printout to the bookstore with her.
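
A sketch of one step such a scanning workflow would likely need: book barcodes are normally 13-digit "Bookland" EAN codes beginning with 978, while catalogs of this era are keyed by 10-digit ISBNs. The conversion below assumes the scanner output has already been decoded to plain digits (the CueCat's own output encoding is not handled), does not verify the EAN check digit, and uses a made-up function name.

```python
def ean13_to_isbn10(ean):
    """Convert a 978-prefixed EAN-13 barcode number to an ISBN-10."""
    digits = ean.replace("-", "").replace(" ", "")
    if len(digits) != 13 or any(c not in "0123456789" for c in digits):
        raise ValueError("expected 13 digits")
    if not digits.startswith("978"):
        raise ValueError("not a 978 Bookland code; no ISBN-10 equivalent")
    core = digits[3:12]  # drop the 978 prefix and the EAN check digit
    # ISBN-10 check digit: weights 10..2 over the nine core digits, mod 11.
    total = sum((10 - i) * int(d) for i, d in enumerate(core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))


if __name__ == "__main__":
    print(ean13_to_isbn10("9780306406157"))
```

The resulting ISBN-10 is what would then be looked up in a shared database or a library catalog to fill in title and author for the printout.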
Re:*Here's* why we need this. by kiscica · 2002-04-16 14:12 · Score: 2, Interesting

Oh, I get it all right. I have more than 20 thousand books -- no idea of the actual number, that's just based on multiplying the number of packed-full shelves by the number of books on an average shelf. Many of them are old, as in pre-ISBN. Many of them were published in other countries and/or in other languages and don't show up in your typical database. I have numerous Hungarian books, for example, that aren't in the online catalog of any United States library.

I'm working on a catalog of my books (and my etexts, and my tens of thousands of physical and digitized sound recordings, and small quantities of miscellaneous other media -- I'm not really into video). Indeed, bibliography is an interest of mine, and I've long had ideas for very nontraditional, loosely-structured, multiply-hierarchical hypertextual catalogs. I've been implementing small parts of these ideas for over ten years.

But actually getting any reasonable fraction of my library into a database strikes me, on even my most optimistic days, as a Herculean task. It's hard to get started, because when I do have any free time, I prefer actually reading the books to cataloguing them. Oh, when I actually get out of postdoctoral research hell and get a real job, I might have enough money to hire someone to do data entry (then again, I'm likely to want to spend the extra money on books -- fortunately I just got married and my wife might act as a braking force against that tendency).

With a little luck, I'll have the structural framework for my catalog coded in a year or two. But actually getting the data into a database will be a huge task, and one which my CueCat (or the more professional barcode scanners I recently dumpster-dived) will hardly begin to help with. (Only comparatively-recently published books have bar codes, and not even all of them).

A unified catalog with all the records from Library of Congress, Books in Print, and university/state libraries around the world would be fantastic, though, if only to "fill in the blanks" with a minimum of manual entry for any given book. (I do have access through my university to some things that help, though, the unified bibliographical catalogs that librarians use. But I have to write glue code to automate access to them, and that's a pain in the butt).

Why do I want to catalogue my library? Well, there are a couple of reasons. The main one is probably that I want to build the hypertextual database that I alluded to above. When I read books, I make notes (mentally or otherwise). The notes usually make reference to other books. It would be nice to record these notes in the database; eventually it would be a web reflecting what I've thought about various books throughout time. I'm a fairly disorganized person, and if I just jot something down somewhere I'll lose track of it. And if I try to keep it all in mind, I'll inevitably start to forget.

Being disorganized also justifies a catalog on purely practical terms -- it would be nice to know for sure, when for instance I see a book that I've already read and liked in a used bookstore, whether I already have the book (in which case I certainly don't want a duplicate), or read it somewhere else (in which case I certainly do want to buy it). And, since my books are not shelved according to any rational system, a catalog might help me find them (though I don't usually have much trouble with this). Note that I have no intention of significantly rationalizing the shelving even if I do catalogue the books. I'm much more likely to simply record my idiosyncratic locations in the database.

A final reason for cataloguing is that my collection is fairly comprehensive in a few specialized areas and I definitely do have a few books, at least, that would be very hard to find in this country. I'd be willing to lend out such books to (trustworthy) people. But people need to be able to find out that I have the books, and I need to be able to keep track of any loans as I'd be loath to lose even a single book. A catalog would be absolutely indispensable for this.

Kiscica
To keep track of your references by SgtChaireBourne · 2002-04-16 19:05 · Score: 3, Interesting
BibTeX and MARC are two formats for managing bibliographic data. But if you're thinking of rolling your own reference manager, you'll quickly find out that it's not just a flat file, and you'll also need to integrate it with your data source and with your editor/word processor.
If you just want to import citations, the Z39.50 search and retrieval protocol is the way to import from your library catalog and many online databases. Indexdata has a number of multiplatform tools that you can use, such as YAZ (a Z39.50 client) and PHPYAZ. Three commercial packages import from Z39.50 sources nicely (Bookwhere, Procite and Endnote). Both Procite and Endnote work well at managing your footnotes during word processing, taking care of numbering and layout (e.g. APA or Chicago Manual of Style).
If you want something under GPL and more oriented to managing web sites and other Internet resources, then you may want to try hypatia. You'll have to ask specially for it, but it's available. Here are the parts I've seen so far:
- Web-based interface, both end users and maintainers.
- Fully multi-lingual, including both interface and content. (It is very easy to add another language to the interfaces. Right now English and Spanish are complete, Norwegian and Finnish are being translated.) Support for Unicode (Which means you're free to add interfaces in or ).
- Useable on many different platforms, including Linux, Unix, and Windows.
- Individual installations can exchange records, allowing federated content and service providers to work together seamlessly. (Haven't tried it yet.)
- Compatible with relevant standards, including MARC, Dublin Core, and the Networked Reference standard currently under development by NISO.
- Special features for digital collections, such as automatic URL checking.
- Authority control over names (e.g. People and Organizations).
- Uses perl/MySQL/javascript
You can see the end user interface in production at the IPL in the serials, newspapers, or online texts collections. The collection management interfaces are even nicer and very useful. I'm sure it can be tweaked for data on legacy media as well.
--
Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.