Slashdot Mirror


Google Books As "Train Wreck" For Scholars

Following up on our earlier discussion, here's more detail on Geoffrey Nunberg's argument that Google Books could prove detrimental to academics and other scholars. Recently Nunberg gave a talk at a conference claiming that the metadata in Google Books is riddled with errors and is classified in a scheme unfit for scholarly use. This blog post was fleshed out somewhat a few days later in the Chronicle of Higher Education. Quoting from the latter: "Start with publication dates. To take Google's word for it, 1899 was a literary annus mirabilis, which saw the publication of Raymond Chandler's Killer in the Rain, The Portable Dorothy Parker, [and] Stephen King's Christine... A search on 'internet' in books written before 1950 turns up 527 hits. ... [Google blames some errors on the originating libraries.] ...the libraries can't be responsible for books mislabeled as Health and Fitness and Antiques and Collectibles, for the simple reason that those categories are drawn from the Book Industry Standards and Communications codes, which are used by the publishers to tell booksellers where to put books on the shelves. ... In short, Google has taken a group of the world's great research collections and returned them in the form of a suburban-mall bookstore." The head of metadata for Google Books, Jon Orwant, has responded in detail to Nunberg's complaints in a comment on the original blog post, and says his team has already fixed the errors that Nunberg so helpfully pointed out.

3 of 160 comments

  1. Something is usually better than nothing by Anonymous Coward · · Score: 5, Insightful

    And this is no exception. Before Google Books you had access to books from various libraries, books you owned, books you could borrow from friends (*shock* *gasp* copyright infringement), books you could buy and books from non-Google online sources. Now you have access to all of those and additionally Google Books. Even if Google Books is 99% "piece of shit" (which in my experience is simply not true, but nevertheless) you still have the 1% potentially useful material available that wasn't available before, so you win.

  2. Re:Obnoxious by Volante3192 · · Score: 5, Insightful

    Definitely. It's like, "Oh, look, I found an error. If I had done this, that error wouldn't be there!!" And to that I respond, then do it yourself. YOU go tack metadata onto the 100 million books they have, you smug egocentric bastard.

    And, of course, he completely ignores the 999,999 proper entries compared to the 1 error. Google seems to know there are lots of problems here, and they're not going to get it right on the first pass. But having a first pass at all is better than nothing.

  3. Too much information? by presidenteloco · · Score: 5, Insightful

    Yes, having all of the world's literature available for instant full text search sounds disastrous for scholars.

    --

    Where are we going and why are we in a handbasket?