Slashdot Mirror


Book-Digitizing Robots

Makarand writes "Robotic digitization systems are the new help available to complete voluminous scanning tasks. Robots that can turn the pages of books and newspaper volumes and attain scanning speeds of more than 1000 pages/hour are now available. They even use puffs of compressed air to separate sticky pages!"

21 of 233 comments

  1. Freedom 'Bots by rdewald · · Score: 5, Insightful
    I think there is a touch of naivete in this notion:

    "Think about the power of bringing our library to little schools in the middle of Africa," Keller said. "Would it make a difference for those who now have their minds closed to the idea of democracy?"


    I am not sure it would. It might turn them on to the idea of thinking for themselves, though. That could have interesting consequences. Unfortunately, this very possibility is threatening to those who now profit from their ignorance. These people are likely in a position to be gatekeepers for the dissemination of information.

    But having a robot do something that is enhanced by mindless repetition is a natural robotic application. Having that application be something that could enable political liberation is an interesting twist on the old "robots in service to humanity" ideals. I'm not so sure that those holding the reins are going to be so interested in this--call me cynical.

    What I would like to see is a similar device for converting analog recordings, in whatever form, be it tape, vinyl, or wax cylinders, to an open digitized format, and then have those recordings made available in like fashion. It might be just as interesting to turn those kids in Africa on to Mozart, or oral arguments from the Supreme Court.
    --
    The best way to do is to be.
    1. Re:Freedom 'Bots by Joe+the+Lesser · · Score: 4, Funny

      Would it make a difference for those who now have their minds closed to the idea of democracy?

      Are you talking about the US Government here?

      --
      "I only speak the truth"
      Karma: null(Mostly affected by an unassigned variable)
    2. Re:Freedom 'Bots by KrispyKringle · · Score: 4, Informative
      Interesting point. However, it's useful to note that there are a lot of charitable and commercial corporations which currently fund (perhaps for the PR value rather than their own good intentions, and because the US dollar goes so far in most parts of Africa) technology initiatives and other educational programs. I've posted in the past about a program I'm involved in, funded by a couple of US corporations, to put computers and networks in a West African university.

      In regards to your vinyl recording idea, couldn't you just hook up a record changer (yes, they do make these; they have a big spindle and an arm) to a DAT or similar digital recording device, and then use some audio software to cut tracks at blank space?
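
A rough sketch of the "cut tracks at blank space" step, assuming the audio has already been captured as a list of samples normalized to the range -1.0 to 1.0. The function name, the threshold, and the minimum gap length are all made up for illustration; real vinyl has surface noise, so the threshold would need tuning per recording.

```python
def split_at_silence(samples, threshold=0.02, min_gap=4):
    """Return (start, end) index pairs for the non-silent runs in samples.

    A run ends once at least min_gap consecutive samples stay at or
    below threshold. end is exclusive. All parameters are illustrative.
    """
    tracks = []
    start = None   # index where the current track began, if any
    quiet = 0      # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap:
                # Close the track at the first quiet sample of the gap.
                tracks.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Recording ended mid-track; trim any trailing quiet samples.
        tracks.append((start, len(samples) - quiet))
    return tracks


# Two bursts of sound separated by a six-sample gap.
demo = [0.0] * 3 + [0.5] * 5 + [0.0] * 6 + [0.4] * 4 + [0.0] * 2
print(split_at_silence(demo))
```

Short pauses inside a song (shorter than min_gap) do not split the track, which is the main reason to require a minimum gap rather than cutting at the first quiet sample.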

    3. Re:Freedom 'Bots by qoncept · · Score: 4, Insightful

      Wouldn't they need something capable of viewing these digitized formats first?

      --
      Whale
    4. Re:Freedom 'Bots by gurps_npc · · Score: 4, Insightful
      I think your concept of converting analog to digital is ridiculous.

      Analog by definition is ALWAYS readable. It is the SINGLE format that is by definition OPEN, can always be understood by anyone, and can stand the test of time. Aliens could discover an analog recording 50 billion years from now and decode it without knowing ANYTHING else about our culture. But right now, data encoded 25 years ago in an open digital format is often incredibly hard to translate to a usable form.

      Digital requires people to understand the digital format. The ONLY advantage to it is quality via the suppression of unintended noises. But if we are copying something that started out as Analog, then the quality improvement is minimal at best.

      Do not blindly use Digital for things where Analog is far better.

      --
      excitingthingstodo.blogspot.com
    5. Re:Freedom 'Bots by Tackhead · · Score: 4, Insightful
      > Analog by definition is ALWAYS readable. It is the SINGLE format that is by definition OPEN, can always be understood by anyone, and can stand the test of time. Aliens could discover an analog recording 50 billion years from now and decode it without knowing ANYTHING else about our culture. But right now, data encoded 25 years ago in an open digital format is often incredibly hard to translate to a usable form.

      Hey Glortzotnik! Check this out! These humans, they used lasers to inscribe little hills and valleys in aluminum discs 12" in diameter for video, then smaller hills and valleys in aluminum discs 5" in diameter for audio, and then they used lasers to start chemical reactions that changed the color of a dye layer in big sloppy round holes with lots of fuzziness around the edges for video again.

      Okay, nothing wrong with that, but the funny part - get this - they called the laser paintings and the chemical dyes "digital", as if it were somehow different from scratching clay with a stick or a wax cylinder with a needle. Laugh riot, these humans!

      To a DSP engineer, everything is analog.

    6. Re:Freedom 'Bots by konch · · Score: 4, Insightful

      Actually, Africans such as the Igbo people of Nigeria have always had democratic institutions. And most Africans I know are very well informed. The people who need to learn more about democracy are the Americans. They've got a long way to go.

  2. Short Circuit by sin(theta) · · Score: 5, Funny

    Finally, Johnny-5 is coming alive!

  3. Scanned pages by Ed+Avis · · Score: 4, Interesting

    This story is a good opportunity to plug some free software you could use to help digitize books.

    Stuart Inglis's tic98 is a lossless compressor designed for black-and-white scanned documents. It achieves better compression ratios than anything else, or at least it did a couple of years ago. If you have scanned documents to make available online, it's fairly simple to write a CGI script to convert tic98 on the fly to PDF.

    Hopefully someone else will reply to this comment with a recommendation of good free OCR software.

    --
    -- Ed Avis ed@membled.com
    1. Re:Scanned pages by tempestdata · · Score: 5, Informative

      Actually, I've seen this robot operate in person and it is a work of art. The way the arms move makes you think it's going to rip the book to pieces, yet somehow it manages to pick up exactly one page (it detects if it has picked up two pages and drops the extra page) and flip it.

      I was the lead developer for the software side that actually does the crunching on the images. However, I'm not sure exactly how much I am allowed to talk about it, so I won't. Basically, the software side of it produces PDFs, JPGs and TXT files from the OCR performed on the images.

      --
      - Tempestdata
  4. I'm all for democracy, of course... by CommieLib · · Score: 5, Funny

    But does this passage puzzle you a bit?

    "Think about the power of bringing our library to little schools in the middle of Africa," Keller said. "Would it make a difference for those who now have their minds closed to the idea of democracy?"

    I'm not sure I get the connection:

    Mbutu: Hey, Kwasa, check out this copy of "The Horse Whisperer" on my Palm Pilot.

    Kwasa: Incredible! We must hold free elections immediately!

    --
    If your bitterest enemies are people who hack the heads off civilians, then I would say you're doing something right.
  5. Project Gutenberg by Mechanik · · Score: 5, Insightful

    What do we need to do to get one of these donated to Project Gutenberg? Right now one of the biggest things holding them up is a lack of volunteers to manually scan the books.


    Mechanik

    1. Re:Project Gutenberg by tempestdata · · Score: 4, Interesting

      Well, I have some good news for you. While I was working (and I still am, actually) on this project, I asked the Digital Library Projects Manager, who is basically in charge of this project, about releasing the books they scan to the public. His reply was that they were probably going to release a pretty significant portion of the books they scan to the public. The rest would only be available within Stanford University Libraries.

      So, you may at one point see those books freely available for download, provided they can get those copyright issues ironed out.

      --
      - Tempestdata
  6. Re:Great, but.. by Daniel+Boisvert · · Score: 4, Insightful

    All it takes is one *really* large project. If somebody like the Library of Congress started scanning/digitizing their collection (I know--subject/verb agreement :), it would obviate the need for just about any smaller libraries to do so. You don't need thousands of libraries to scan the same book, you only need one, and then you can replicate electronically. Surely there are specialty libraries around that have unique collections, but again--all you need is one...

    I didn't RTFA, but this could be useful not only for developing countries, but as a "force-multiplier" of sorts for smaller community libraries. En masse digitizing of published works would allow smaller libraries to compete on a more even footing with larger ones, without having to invest loads of money into their collections and facilities to hold them.

    Any well-heeled library patrons out there want to donate some money earmarked for one of these things to the large library of your choice?

  7. Archival Projects by borkus · · Score: 4, Insightful

    This would be awesome for records/document archiving. I knew a guy who worked at our State Library who had to catalog courthouse records across the state. He'd go out to some remote county where all the marriage, land and court records were on paper and try to figure out what they had. Some of the records went back to before the American Revolution. In nearly all cases, the only records were on paper.

    If he could drag this robot along to a courthouse and scan the records over a couple of weeks, it would allow him to digitize that information quickly. Not only would the digital copies be easier to search, they would be easier to preserve. One courthouse, whose file room was in the basement, nearly lost all of its old records to a flood.

  8. Finger lickin good by dspfreak · · Score: 4, Funny
    They even use puffs of compressed air to separate sticky pages!

    I'm glad they didn't go with the design where it licked its thumb before turning each page. I hate that!

    --
    "Tolerance is the virtue of the man without convictions." -- G. K. Chesterton
  9. Book Ripping and Burning! by Dr.+Evil · · Score: 4, Funny

    Time for a change in terminology.

  10. Re:Digitizing Pr0n? by msheppard · · Score: 4, Funny

    I'm afraid a "puff of compressed air" ain't gonna unstick those pages.

    M@

    --
    Krispy Cream is people
  11. LORD - Dont you people see what's happening here?! by blakespot · · Score: 4, Funny
    I don't know about you, but when I see a robot latched onto one of humanity's tomes of knowledge, poring over it at 1000 pages / hour, puffing and aiming its high resolution CCD, I see what is clearly the first step in the rise of machines which will lead to the utter annihilation of humankind!!! We can't just feed them our knowledge!!

    For the love of GOD, someone check this!!


    blakespot

    --
    -- Heisenberg may have slept here.
    iPod Hacks.com
  12. Does it cost that much? by zebadee · · Score: 4, Insightful

    The article says it would become cost effective for 5.5 million pages. Later it says it costs between $1 and $4 per book in the Far East. So if you estimate a book to have around 300 pages, doing the digitising manually would cost $18,333 to $73,333 per 5.5 million pages (i.e. 5,500,000/300 multiplied by the cost per book). From the way the article is written, I expected it to cost A LOT more. I guess the proofreading cost for manual conversion could be high?
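
For anyone who wants to check the arithmetic above, here is a quick sketch. The 300 pages per book figure is the commenter's own estimate, and the variable and function names are just for illustration.

```python
PAGES_TOTAL = 5_500_000   # break-even page count quoted from the article
PAGES_PER_BOOK = 300      # commenter's estimate, not a figure from the article


def manual_cost(cost_per_book):
    """Cost in dollars of digitising PAGES_TOTAL pages by hand."""
    books = PAGES_TOTAL / PAGES_PER_BOOK
    return books * cost_per_book


low = manual_cost(1)    # at $1 per book
high = manual_cost(4)   # at $4 per book
print(f"{PAGES_TOTAL / PAGES_PER_BOOK:,.0f} books")
print(f"${low:,.0f} to ${high:,.0f}")
```

That works out to roughly 18,333 books, so the quoted range follows directly from the per-book price.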

  13. Re:Hmm... by bob_jordan · · Score: 4, Funny

    "So what measures can we expect to see? Lighter inks, crazier fonts to screw with the robot's OCR? Funny paper that makes it hard to flip pages?"

    I think you just described a typical issue of Wired. Are they worried about people copying?

    Bob.