Slashdot Mirror


Google Engineers Open Source Book Scanner Design

c0lo writes "Engineers from Google's Books team have released the design plans for a comparatively reasonably priced (about $1500) book scanner on Google Code. Built using a scanner, a vacuum cleaner and various other components, the Linear Book Scanner was developed by engineers during the '20 percent time' that Google allocates for personal projects. The license is highly permissive, thus it's possible the design and building costs can be improved. Any takers?" Adds reader leighklotz: "The Google Tech Talk Video starts with Jeff Breidenbach of the Google Books team, and moves on to Dany Qumsiyeh showing how simple his design is to build. Could it be that the Google Books team has had enough of destroying the library in order to save it? Or maybe they just want to up-stage the Internet Archive's Scanning Robot. Disclaimer: I worked with Jeff when we were at Xerox (where he did this awesome hack), but this is more awesome because it saves books."

12 of 69 comments

  1. False economy by srussia · · Score: 4, Insightful

    FTFA: For the past eight years, Google has been working on digitizing the world's 130 million or so unique books.

    If these books are truly unique, you're taking a big risk subjecting them to this contraption.

    --
    Set your phasers on "funky"!
    1. Re:False economy by vlm · · Score: 5, Funny

      The proper SQL statement would have been "DISTINCT" not a "UNIQUE" index, true.

      --
      "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    2. Re:False economy by plover · · Score: 2

      He addresses that in the talk. Yes, this machine can fold or tear pages. But they talked to an archivist, and he said that scanning the books in this machine was less risky than not scanning them at all. If they're scanned, the information is preserved, backed up, spread around, and is then widely available. Any library book is subject to risk from the patron tearing or damaging the book, yet they still accept the risk of making them available.

      Besides, how much worse is the risk of possibly tearing a page as compared to a bulk scanner that saws the binding off the book, then feeds the pages through a sheet feeder? That one is guaranteed damaging.

      --
      John
  2. Very Good Wiki Direction by retroworks · · Score: 4, Interesting

    I work in the tech recycling business, but we get literally hundreds of tons of books turned in for recycling. It pains me to see most of them go to paper recycling recovery, though there is a growing market for shops that scan barcodes for resale. I would think that Google would have problems with copyright law, as would any single entity who is at risk of scanning the wrong book (i.e. the one someone would take time to sue you for, especially if you have deep google-pockets). This direction opens to small scale "wiki-scanning", which could be really ideal since people who have actually read the book would probably be the best ones to figure out if it was worth the time to scan, would tend to prioritize important books (preserving them) and would present a very decentralized system for lawsuits. If I can scan the book for "personal use" like the cassette tape rulings for music, all the better. The problem is the physical space these books take, and it's causing a lot of out of print books to get made into cereal boxboard, and the scale at which 50-100 year old out of print books are getting recycled is scary.

    --
    Gently reply
    1. Re:Very Good Wiki Direction by Sqr(twg) · · Score: 2

      If you're scanning to save physical space, you don't need this contraption. Just cut off the back of the book and put the pages in a regular scanner with a sheet feeder. (You can get an excellent one for about $400, including OCR software.)

  3. Harvesting knowledge in case of society collapse by concealment · · Score: 3, Insightful

    We know it can happen. Rome fell, Greece fell, Angkor Wat fell, Easter Island collapsed. Societies die just like we do.

    It would be a shame to lose all of the knowledge, art, and literature that we have accumulated during our tenure so far.

    Scanning books is a good way to archive much of that information for the next society that can develop digital computing. I suggest we enshrine it all in orbit or on the moon, guaranteeing it relative immortality and making it accessible only to those technologically advanced enough to benefit from it.

    For all we know, the ancient Khmer civilization at Angkor Wat invented advanced technology, and it was merely lost to time.

    We owe it to future generations to make sure our society does not lose as much when it collapses.

  4. Having looked at the design... by pongo000 · · Score: 2

    ...I think it's fundamentally flawed in that it would not take much to have a misaligned page sliced right out of the book. Certainly nothing I'd risk a book of any value over. Sorry, this one appears to be a non-starter (although it is rather novel, pun intended).

  5. Re:Harvesting knowledge in case of society collaps by bickerdyke · · Score: 4, Insightful

    But stone & clay slabs of the Sumerians and papyrus of the Egyptians survived until today, while the original data feed of the Apollo missions is lost forever because it was trashed when no one had the equipment to read the old data tapes.

    --
    bickerdyke
  6. Google's motivation by swillden · · Score: 5, Insightful

    The summary questions Google's motivations for doing this, but I think it should be clear this isn't a Google project, really. 20% projects can't be totally random, personal things that have no relationship whatsoever with the business or possible business... but the link can be very tenuous, and the cooler the project is, the weaker it can be. All tech managers at Google are engineers themselves and tend to be just as able to geek out about cool stuff as the people they supervise.

    Various other bits of obvious Google support for the project are also more incidental than planned. For example, Dany mentions that he built the machine in one of the on-campus workshops. Those workshops are there for "real" work, but they're also available for any employees to use on an as-available basis. Tech talks are also organized by and for the employees for their own interests, with basically zero "corporate" supervision. Most are actually job-related, but far from all. There are plenty of project talks and hobby talks (though this particular hobby/project talk is much cooler than most).

    I imagine there was a cursory review required to get permission to publish the talk and the design, but such things tend to be handled on a "is there some really good reason we should say no?" basis. If not... go for it. Publishing cool, geeky things done by Google engineers is pretty positive for Google's brand, and it makes the engineers happy, which is good for employee retention -- especially since the kind of employees who do cool stuff for fun is the kind Google most wants to retain.

    Bottom line: It's very unlikely anyone at Google has a corporate strategy built around the release of this information. It's just an engineer doing something he thinks is fun and valuable (to someone) and the company providing generic support for such activities, and otherwise staying out of the way.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    1. Re:Google's motivation by ceoyoyo · · Score: 2, Informative

      "Could it be that the Google Books team has had enough of destroying the library in order to save it?"

      The Google Books team is not Google. It's a group of people, some of whom built this non-destructive reader. It's quite likely these people, who probably love books, started by wondering if there was a way they could scan their content without damaging them physically, and decided to use their 20% time to figure it out.

      As for scanning books, that is most definitely a Google-the-company supported project, which they've put a lot of company resources into, including going to bat in the courts to defend the project.

  7. Re:Harvesting knowledge in case of society collaps by bickerdyke · · Score: 2

    This wasn't meant as a comparison of better and worse. Just as a set of risks specific to digital archives.

    Go and try to read your letters from a 5.25'' floppy disc with your VizaWrite-files from just a few years ago. Wouldn't have happened with paper printouts.
    On the other hand, go to a movie archive and see the first cellulose movies lost due to simply rotting away... wouldn't have happened with DVDs.
    Then again, if there's no DVD player left....

    A form of archiving that needs special knowledge (file formats) or devices (media) adds an additional long term risk. But of course it also greatly reduces other risks. (When the Amalia-library in Weimar burned down a few years ago, lots of invaluable books were destroyed. Google Books could simply restore an offsite backup.) It's simply a tradeoff, as always.

    And this is in no way a new problem. (Pulling this from the back of my mind, corrections welcome)

    The Sumerians used soft clay slabs for rather unimportant, temporary stuff as it could be erased easily. "cultural treasures" like the Gilgamesh epic were stored on valuable parchment. Now guess what survived a series of fires..... Hint: we have tons of shopping lists to work with....

    --
    bickerdyke
  8. Shredder scanner by tnk1 · · Score: 2

    I'm waiting for a reference to the shredder-scanner from Rainbows End to come up.

    http://en.wikipedia.org/wiki/Rainbows_End (although the wiki article doesn't mention that piece of the plot, sadly)