Slashdot Mirror


Book-Digitizing Robots

Makarand writes "Robotic digitization systems are the new help available to complete voluminous scanning tasks. Robots that can turn the pages of books and newspaper volumes and attain scanning speeds of more than 1000 pages/hour are now available. They even use puffs of compressed air to separate sticky pages!"

4 of 233 comments

  1. Scanned pages by Ed Avis · · Score: 4, Interesting

    This story is a good opportunity to plug some free software you could use to help digitize books.

    Stuart Inglis's tic98 is a lossless compressor designed for black-and-white scanned documents. It achieves better compression ratios than anything else, or at least it did a couple of years ago. If you have scanned documents to make available online, it's fairly simple to write a CGI script to convert tic98 on the fly to PDF.

    Hopefully someone else will reply to this comment with a recommendation of good free OCR software.

    --
    -- Ed Avis ed@membled.com
  2. Hmm... by stratjakt · · Score: 3, Interesting

    What do the newspapers, and more likely magazines, think of this?

    Now the magazine rack at 7-11 will show up on Kazoom and all that.

    I mean, comic books, or "graphic novels" as the nerds call 'em, already get traded freely, but that's because some joker with no life takes a day out of his life to scan and crop each page.

    But if you could just take the magazines, stick 'em in this robot, then share 'em, it could hurt the publishing industry the way it's hurt the recording industry.

    And everyone will justify it by saying "why should I buy a magazine when it only has one good article and the rest is crap!"

    So what measures can we expect to see? Lighter inks, crazier fonts to screw with the robot's OCR? Funny paper that makes it hard to flip pages?

    --
    I don't need no instructions to know how to rock!!!!
  3. Re:Project Gutenberg by tempestdata · · Score: 4, Interesting

    Well, I have some good news for you. While I was working on this project (and I still am, actually), I asked the Digital Library Projects Manager, who is basically in charge of it, about releasing the books they scan to the public. His reply was that they would probably release a pretty significant portion of the scanned books to the public. The rest would only be available within Stanford University Libraries.

    So, you may at some point see those books freely available for download, provided they can get those copyright issues ironed out.

    --
    - Tempestdata
  4. Destroying books to save them by shoppa · · Score: 3, Interesting

    The page-turning robots are unique because they do little (or no?) damage to the books to get them digitized.

    The more traditional way to preserve the contents of old books is to destroy them in the process. Actually cutting the pages out of the book lets you get a much higher quality scan because each page is then really, truly flat. (Yes, there are correction techniques for turning scans of non-flat pages into flat "projections", but they aren't nearly as good as just ripping the page out and scanning it.)