Slashdot Mirror


Open Source Library Card-Catalog Apps?

dmd writes: "Does there exist Open Source software for maintaining a small to medium sized library card-catalog? It seems all the tools are available: a perl module for working with MARC records, several for working with Z39.50 and XML, and even a web site apparently devoted to nearly this exact topic. An actual, working, catalog, however, seems to be missing. Is this something that would be valuable? I, for one, have nearly 5k volumes in my collection, and they're begging for some discipline." I'm sure cash-strapped public libraries and schools would like to be able to use free / Free tools for this, since paper books aren't going away anytime soon. Not to mention for CDs, videos, charts, museum holdings ... any ideas out there? Turnkey solutions?

9 of 111 comments

  1. Great opportunity for a project by MochaMan · · Score: 3

    Actually, this is something I have been looking into for months. The reason being that most elementary/middle/high schools can't afford a decent catalogue system. Several that I have seen have been using software with (no kidding) CGA graphics interfaces, and searching by title only.

    Those schools that do have money move to software like Eloquent -- systems that are way more complex than a school library typically needs. Most schools don't need that much power/customisation, and can't afford it anyway. What seems to be needed is a basic system that offers searching on author/title/subject/keyword, and possibly uses MARC records (though for a school library this is not essential).
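
    As a rough illustration of the "basic system" described above, here is a minimal Python sketch of searching on author, title, subject, or keyword over an in-memory list. The `Book` record, its fields, the `search` helper, and the sample catalogue are hypothetical names made up for this example; they are not taken from any existing library package.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    # Hypothetical minimal record; a real catalogue would carry far more.
    title: str
    author: str
    subjects: list = field(default_factory=list)


def search(books, term, by="keyword"):
    """Case-insensitive substring search on one index, or on all of them."""
    needle = term.lower()

    def haystacks(book):
        if by == "title":
            return [book.title]
        if by == "author":
            return [book.author]
        if by == "subject":
            return book.subjects
        if by == "keyword":
            # Keyword search simply looks everywhere.
            return [book.title, book.author, *book.subjects]
        raise ValueError(f"unknown index: {by}")

    return [b for b in books if any(needle in h.lower() for h in haystacks(b))]


# Sample data, invented for illustration only.
CATALOGUE = [
    Book("The Hobbit", "Tolkien, J. R. R.", ["Fantasy fiction"]),
    Book("A Brief History of Time", "Hawking, Stephen", ["Cosmology"]),
    Book("Cosmos", "Sagan, Carl", ["Astronomy", "Cosmology"]),
]
```

    For example, `search(CATALOGUE, "cosmology", by="subject")` would return the Hawking and Sagan entries. A linear scan like this is only reasonable for a small collection; anything bigger would want a real index.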

    It would have to be easy to set up, and low maintenance (i.e. a basic linux box shoved under a desk somewhere with a UPS and a tape backup). You need to keep in mind that libraries -- and school libraries in particular -- are likely to have a multitude of machines running different OSes, so something like a web interface would be perfect.

    Considering the fact that most schools are getting networked these days, it's feasible to have a linux box sitting under a desk somewhere running a database, some library software, and Apache, and a bunch of Mac/PC clients running MacOS and Windows and interfacing to this thing via a web server. The checkout could be the same idea. This could be extended to have non-web clients running on various platforms and talking to the server via CORBA.
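
    The web-interface idea above can be sketched in a few lines. This is only an assumption-laden toy: a WSGI callable from the Python standard library stands in for "some library software" behind the web server, `TITLES` stands in for the database, and the `catalogue_app` name and the `q` query parameter are invented for the example.

```python
from html import escape
from urllib.parse import parse_qs

# Hypothetical stand-in for the real catalogue database.
TITLES = ["Cosmos", "The Hobbit", "A Brief History of Time"]


def catalogue_app(environ, start_response):
    """Answer GET /?q=term with an HTML list of matching titles."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    term = params.get("q", [""])[0].strip().lower()
    # An empty search term matches nothing rather than everything.
    hits = [t for t in TITLES if term and term in t.lower()]
    items = "".join(f"<li>{escape(t)}</li>" for t in hits)
    body = f"<html><body><ul>{items}</ul></body></html>".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

    To try it locally, something like `wsgiref.simple_server.make_server("", 8000, catalogue_app).serve_forever()` should do; any browser on a Mac or PC client can then hit it, which is the whole point of going through the web.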

    In talking with librarians, I've found that you can't just say "dump MacOS/Windows and put Linux on all your machines" because they don't just use them for searching. They use them to run all sorts of stuff -- CD-ROM based educational software, etc. In other words, it's important to remember that for software like this, you can't just get a bunch of developers together and make decisions and write code. There are a ton of assumptions you just can't make when you're dealing with libraries and schools. There's a bunch of research into what people really want that's required. That makes it a little trickier a project than, say, a mahjongg game -- no offense to mahjongg hackers...

    Anyway, this is a fantastic opportunity for development, and one that I have been very interested in for a while now. It's also been on the GNU project's list of stuff to do for years now. Contributing a GPLed library system would be great not only for Free Software, but also for schools everywhere who can't afford decent software in their libraries.

  2. MySql couldn't do it right by Anonymous Coward · · Score: 3

    And I don't mean this to sound like a slam against MySql. No SQL database could do it in a way that a librarian would be completely happy with, primarily because of the wonderful MARC format.

    The MARC format is the standard format used to store bibliographic information. It was originally created in the early 60's, with the idea that the primary means of transmission would be on tape. It supports well over 300 different major fields, ranging from simple ones that anyone would understand (author, title, publisher) to arcana that only a trained librarian could love (is the item a festschrift, unusual pagination comments, magazine run dates, and on and on.) Most of the major fields have "sub-fields", where the data is broken into different elements (e.g. an author field will have a name sub-field, a dates sub-field, a title sub-field, and possibly others.)
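
    To make the field/sub-field structure concrete, here is a minimal in-memory sketch in Python. The class names (`Subfield`, `Field`, `Record`) and the `get` helper are hypothetical, and the sample record is invented for illustration; the tags used (100 for the main author, 245 for the title, 650 for a topical subject) are ordinary MARC bibliographic tags, but this is nowhere near a complete model of the format.

```python
from dataclasses import dataclass, field


@dataclass
class Subfield:
    code: str    # single-character sub-field code, e.g. "a"
    value: str


@dataclass
class Field:
    tag: str                                       # three-character tag, e.g. "100"
    subfields: list = field(default_factory=list)  # ordered Subfield objects


@dataclass
class Record:
    fields: list = field(default_factory=list)     # ordered Field objects

    def get(self, tag):
        """Return every field with this tag, since a tag may occur more than once."""
        return [f for f in self.fields if f.tag == tag]


# Invented sample record, for illustration only.
record = Record([
    Field("100", [Subfield("a", "Tolkien, J. R. R."), Subfield("d", "1892-1973")]),
    Field("245", [Subfield("a", "The Hobbit")]),
    Field("650", [Subfield("a", "Fantasy fiction")]),
    Field("650", [Subfield("a", "Hobbits")]),
])
```

    Note that `record.get("650")` hands back two fields here: the nesting and the repetition are both part of the data, not an accident of the example.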

    Fields in the MARC format have a theoretical maximum length of 10,000 characters. Many of the fields can be repeated any number of times (co-authors, variant titles, subject headings). I've seen several attempts to model the MARC format in a relational model, and while it can be done, it's a royal pain in the ass and it inevitably winds up with trade offs.
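
    For a sense of what "it can be done, but it's a pain" looks like, here is one possible relational layout, sketched with Python's built-in `sqlite3`. The three-table schema and the `store` and `subfield_values` helpers are assumptions made up for this example, not the design of any actual library system, and the sample values are invented.

```python
import sqlite3

SCHEMA = """
CREATE TABLE record   (id INTEGER PRIMARY KEY);
CREATE TABLE field    (id INTEGER PRIMARY KEY,
                       record_id INTEGER NOT NULL REFERENCES record(id),
                       tag TEXT NOT NULL,      -- e.g. '245'
                       seq INTEGER NOT NULL);  -- keeps repeated fields in order
CREATE TABLE subfield (id INTEGER PRIMARY KEY,
                       field_id INTEGER NOT NULL REFERENCES field(id),
                       code TEXT NOT NULL,     -- e.g. 'a'
                       seq INTEGER NOT NULL,
                       value TEXT NOT NULL);
"""


def store(conn, fields):
    """Insert one record given [(tag, [(code, value), ...]), ...]; return its id."""
    record_id = conn.execute("INSERT INTO record DEFAULT VALUES").lastrowid
    for fseq, (tag, subfields) in enumerate(fields):
        field_id = conn.execute(
            "INSERT INTO field (record_id, tag, seq) VALUES (?, ?, ?)",
            (record_id, tag, fseq),
        ).lastrowid
        conn.executemany(
            "INSERT INTO subfield (field_id, code, seq, value) VALUES (?, ?, ?, ?)",
            [(field_id, code, sseq, value)
             for sseq, (code, value) in enumerate(subfields)],
        )
    return record_id


def subfield_values(conn, record_id, tag, code):
    """Even a single attribute lookup needs a join across two tables."""
    rows = conn.execute(
        """SELECT s.value
           FROM field AS f JOIN subfield AS s ON s.field_id = f.id
           WHERE f.record_id = ? AND f.tag = ? AND s.code = ?
           ORDER BY f.seq, s.seq""",
        (record_id, tag, code),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
rid = store(conn, [
    ("245", [("a", "The Hobbit")]),
    ("650", [("a", "Fantasy fiction")]),
    ("650", [("a", "Hobbits")]),
])
```

    The trade-off shows up immediately: nothing is lost, but "what is the title?" has become a two-table join, and every search the catalog offers has to be written against this generic shape rather than against plain `title` and `author` columns.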

    For a simple catalog, where you aren't worried about working with the MARC format, a relational database (including MySql) will be perfectly adequate. But librarians love the MARC format, and it is such a basic element of modern librarianship that any system that couldn't import and export it would be considered unacceptable - like a car with a crank starter.

    And I should know. I worked as a librarian for several years; I even have the MLIS to prove it.

  3. Input from a library geek. by dkh2 · · Score: 4
    Yes, by all means code it and make it fully Z39.50 and MARC compliant! However, there are other considerations you need to throw in.

    Commercially available library software that is actually used by libraries is much more than just a cataloging/look-up system to replace those old 3*5 cards.

    You need an acquisitions module that has the ability to do electronic ordering and approval plan processing.

    The search and report capabilities on the staff interface for these things are amazing. I can collect a list of all item records belonging to location X and created within [ range of dates ] that are attached to bibliographic records for [ material type ] within a [ call number range ], sort the records according to my criteria, then output selected fields from either the bib. or the item, or both, in the order I choose to the device of my choice (including print to e-mail or fax) and I haven't even begun to make the system sweat. Yes, this is a fairly straightforward thing to do (selecting records based on data spread across multiple related/linked records) in SQL, but you also need a front end that the end user can comprehend.
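
    The SQL side of a report like the one just described might look roughly like this. It is a sketch only, using Python's built-in `sqlite3`; the `bib` and `item` tables, their columns, the `item_report` helper, and all the sample rows are assumptions invented for the example, not any vendor's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bib  (id INTEGER PRIMARY KEY, title TEXT,
                   material_type TEXT, call_number TEXT);
CREATE TABLE item (id INTEGER PRIMARY KEY, bib_id INTEGER REFERENCES bib(id),
                   location TEXT, created TEXT, barcode TEXT);  -- created: ISO date
""")
# Invented sample data.
conn.executemany("INSERT INTO bib VALUES (?, ?, ?, ?)", [
    (1, "Cosmos", "book", "QB44.2"),
    (2, "The Hobbit", "book", "PR6039"),
    (3, "Cosmos (video)", "video", "QB44.2"),
])
conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "MAIN", "2000-03-01", "0001"),
    (2, 1, "BRANCH", "2000-03-02", "0002"),
    (3, 2, "MAIN", "2000-03-05", "0003"),
    (4, 3, "MAIN", "2000-03-07", "0004"),
    (5, 1, "MAIN", "1999-11-20", "0005"),
])


def item_report(conn, location, created_from, created_to,
                material_type, call_from, call_to):
    """Items at one location, created in a date range, whose bib record
    matches a material type and falls in a call number range."""
    # Call numbers are compared as plain strings here, which is a
    # simplification: real call numbers need a proper sort key.
    query = """
        SELECT b.call_number, b.title, i.barcode
        FROM item AS i JOIN bib AS b ON b.id = i.bib_id
        WHERE i.location = ?
          AND i.created BETWEEN ? AND ?
          AND b.material_type = ?
          AND b.call_number BETWEEN ? AND ?
        ORDER BY b.call_number, i.barcode
    """
    args = (location, created_from, created_to, material_type, call_from, call_to)
    return conn.execute(query, args).fetchall()
```

    A call such as `item_report(conn, "MAIN", "2000-01-01", "2000-12-31", "book", "P", "QZ")` pulls fields from both the bib and the item side in one pass. The query is the easy part; the hard part is still a front end that lets a librarian build it without ever seeing it.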

    If you're going to code it, it will need to be able to interact with all of the prevailing vendors... Ebsco, Baker & Taylor, Basil Blackwell, Swets & Zeitlinger, Matthews, etc... You will want tech contacts from each of these vendors to fine tune the ordering/receiving/approval interfaces.

    Finally, the amount of fiscal reporting done in libraries can boggle your mind. You would never suspect that something so seemingly simple could be so complicated. If you don't have the ability to generate financial reports you might as well go back to index cards and hand written ledgers.

    --
    My office has been taken over by iPod people.
  4. You're right by delevant · · Score: 3
    You're absolutely right about the customization problem.

    You can't just code a database -- that's almost entirely useless; there's also the matter of controlling circulation, tracking books out/returned/requested/held/sent to bindery, etc.
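
    The circulation side can be thought of as a small state machine on top of the database. Here is a minimal Python sketch of that idea; the `Status` values, the event names, and the transition table are all hypothetical and far simpler than what a real library's policies would need.

```python
from enum import Enum


class Status(Enum):
    ON_SHELF = "on shelf"
    CHECKED_OUT = "checked out"
    ON_HOLD = "held for a patron"
    AT_BINDERY = "sent to bindery"


# Hypothetical transition table: event -> {current status: next status}.
TRANSITIONS = {
    "check_out": {Status.ON_SHELF: Status.CHECKED_OUT,
                  Status.ON_HOLD: Status.CHECKED_OUT},
    "return": {Status.CHECKED_OUT: Status.ON_SHELF},
    "hold": {Status.ON_SHELF: Status.ON_HOLD},
    "send_to_bindery": {Status.ON_SHELF: Status.AT_BINDERY},
    "receive_from_bindery": {Status.AT_BINDERY: Status.ON_SHELF},
}


class Item:
    def __init__(self, barcode):
        self.barcode = barcode
        self.status = Status.ON_SHELF
        self.history = [Status.ON_SHELF]   # every status the item has been in

    def apply(self, event):
        """Move to the next status, refusing transitions the table does not list."""
        allowed = TRANSITIONS.get(event, {})
        if self.status not in allowed:
            raise ValueError(f"cannot {event} an item that is {self.status.value}")
        self.status = allowed[self.status]
        self.history.append(self.status)
        return self.status
```

    With this, a checked-out book cannot be sent to the bindery until it has been returned, which is exactly the kind of rule a bare database table will not enforce for you.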

    Plus import & export from vendors, billing, accepting bill payments, cross-referencing, all kinds of freaky subject indexing, mondo-bizarro file formats from a zillion years ago (MARC), etc. etc. etc.

    There's a reason library systems tend to be proprietary -- it's because nobody else in their right mind wants to get involved with things like MARC and Z39.50.

    . . . but then again I could be wrong.

    --
    I have no .sig, and I must scream.
  5. simpler and more complex than you'd think by dchud · · Score: 4
    There are about three problems here (hopefully they won't moderate me down for this cuz I work for oss4lib.org :). The simpler bits have to do with the mindset of librarians: liberal about access, conservative about library collections. Since an online card catalog is about the collection, we librarian types tend to forestall any major systems overhauls until the last possible moment. And our systems vendors only have about a $500M business to sell to, so the general mindshare remains rivalrous, proprietary, dedicated to supporting legacy apps, and lacking overflow of hacker talent. Thus our systems generally suck and few are willing to admit it out loud.

    Second is that half of the pieces that go into a big library management system (including the catalog part) are really generic business systems: EDI, invoicing, accounting, etc., but they haven't been abstracted out of the realm of our systems vendors. So the level of standards followed there is minimal, and those modules generally don't interoperate with our trading partners (i.e. internal payment systems and external suppliers). Lots of redundant keying and more crappy systems to maintain there, all of which is typically deeply and proprietarily tied into the catalog data.

    All that said -- and to our vendors' credit they are tending to get better these days -- we've been sharing catalog data like hackers are sharing code for over 100 years. We've been doing it online for about 35 years, but the way we do it now is pretty much the same way we've been doing it for those 35 years. i.e. largely dependent on one of two .orgs/vendors to be a clearinghouse for sharing catalog data. But those folks disappear if they can't sell the data back to us after we create it for them. So nobody running a library wants them to disappear. Especially because we've got to handle one-of-a-kind rare items in big research libraries as well as unusual local items in public libraries and so on.

    Imho the solution is to first outsource all the standard business stuff to vendors+free software that can do the same job with existing standards-based tools. Then abstract away as much as possible of the catalog data into free references sources shared and maintained by the library community (think: you could run your own amazon.com recommendations site, etc.). This is what we're trying to do (shameless plug alert) with the jake project for journals. Same thing applies for books, although there are probably >=100M records to normalize.

    If we can get that done, then anybody could hack up a gtk+ front end to the free, shared catalog, and pick and choose the items you have yourselves. It would work sorta like dict.org or jake. Just imagine how much easier it will be to search for ebooks in gnutella once this is done... :)

  6. Index Data by heikkile · · Score: 3
    Shameless plug: We at Index Data provide lots of tools you can use for this.

    - Zebra information server. Eats Marc (UsMarc, other local variants) as well as XML, mails, newsgroups, etc. You can add more input filters. Talks Z39.50
    - Yaz Z39.50 toolkit for client and server side
    - Zap web gateway and a PHP module for building easy search gateways to anything that understands Z39.50, for example our own Zebra
    - and more. Even more to come later...

    I am of course biased, but these tools are designed for library applications. All open source, at Index Data.

    --

    In Murphy We Turst

  7. I was looking into this once... by Alan+Shutko · · Score: 3
    I got a bit of code put together to import, save and edit MARC records, with a minimal GTK app. I wasn't looking at perl at the time, because I'm not a perl hacker.

    The major problem I ran into with writing something of the sort is that there's lots of information that you really want to have that isn't on the web. Cataloging rules, the full description of the MARC fields, some of the lists (organization, I think, is one example). I could get some of those from a library, but strangely enough although I'm sure most libraries have them, they aren't necessarily on the stacks, but in people's offices. Even then, I'd have to keep them checked out for long enough that I'd rather buy a copy.

    But, if anyone wants to work on it I'd be glad to help. My ideal app would have to

    • Import records from the LOC or the Z39.50 server of your choice, given either ISBN or title
    • Keep track of holdings information, so I can keep track of books I've lent out, and where I'm keeping said book.
    • Handle magazine article references, so my wife can use it to manage her references in grad school. There's a way to store that stuff in MARC records, although it's not used very often.
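
    The holdings item on that wish list is the easiest to sketch. Below is a minimal Python version, assuming a personal collection keyed by ISBN; the `Holding` and `Shelves` classes, their methods, and the placeholder keys are invented for the example, and the Z39.50 import step is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Holding:
    title: str
    shelf: str                       # where the book normally lives
    lent_to: Optional[str] = None    # None means it is at home


class Shelves:
    def __init__(self):
        self._by_isbn = {}

    def add(self, isbn, title, shelf):
        self._by_isbn[isbn] = Holding(title, shelf)

    def lend(self, isbn, borrower):
        holding = self._by_isbn[isbn]
        if holding.lent_to is not None:
            raise ValueError(f"{holding.title} is already lent to {holding.lent_to}")
        holding.lent_to = borrower

    def give_back(self, isbn):
        self._by_isbn[isbn].lent_to = None

    def where(self, isbn):
        """Say who has the book, or which shelf it is on."""
        holding = self._by_isbn[isbn]
        if holding.lent_to is not None:
            return f"lent to {holding.lent_to}"
        return holding.shelf
```

    After `shelves.add("placeholder-isbn-1", "The Hobbit", "study, shelf 3")` and `shelves.lend("placeholder-isbn-1", "a friend")`, `shelves.where("placeholder-isbn-1")` reports the loan instead of the shelf.
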
  8. MARC tape issues - giving away your tax dollars by Mrs.+Rod · · Score: 4

    Your tax dollars paid for all the cataloging at the Library of Congress.

    Unfortunately, some years back the firm that encodes these records in the MARC formats legally got control not only of its formatted tapes, but of any use of the information after extraction from these tapes. In other words, they own not only the format, but also the government-funded information contained in the format.

    This is critical because these MARC tapes are the primary source of library cataloging information for most libraries. There are some other independent networks, primarily of educational institutions in the western US, but most libraries depend on the Library of Congress OCLC tapes.

    The whole thing stinks, and is ridiculous. As a former librarian, who also holds a BSCS, I was outraged at this theft of public assets. The worst part was dealing with my moronic former colleagues who screamed that of course this company should own this information - it was "intellectual property." Thousands of librarians wrote letters supporting this company's "intellectual property rights" to work created at taxpayer expense.

    This happened because most librarians think that putting information into a data format is some mystical arcana mastered only by brilliant wizards. They do not realize that the far more difficult part of the operation was the original cataloging done by the awesome catalogers at the Library of Congress.

    So, libraries pay through the nose for software. First there is the fee that the vendor has to pay for using the MARC tapes, then the royalties for the actual use of the data contained on the tapes, and then the cost of the library software itself. BTW, most library software is so atrocious, buggy, and difficult to use that its writers would receive a failing grade if it had been turned in as a senior project at any halfway reputable college.

  9. Yes, there is: Koha. by vaxer · · Score: 5
    The Koha Open Source Library System might be useful to you.

    Public libraries, unfortunately, are too often dependent on fiercely proprietary-minded vendors for their daily operations.

    Incidentally, the "go get MySQL, you dumbass" posters are missing an important point: libraries use the MARC data standard for catalog records, and SQL doesn't cope well with the kind of tricks MARC can do.