Book-Digitizing Robots
Makarand writes "Robotic digitization systems are the new help available to complete voluminous scanning tasks. Robots that can turn the pages of books and newspaper volumes and attain scanning speeds of more than 1000 pages/hour are now available. They even use puffs of compressed air to separate sticky pages!"
I am not sure it would. It might turn them on to the idea of thinking for themselves, though. That could have interesting consequences. Unfortunately, just this very possibility is threatening to those who are now profiting from their ignorance. These people are likely in a position to be gatekeepers for the dissemination of information.
But having a robot do something which is enhanced by mindless repetition is a natural robotic application. Then having that application be something that could enable political liberation is an interesting twist on the old "robots in service to humanity" ideals. I'm not so sure that those holding the reins are going to be so interested in this--call me cynical.
What I would like to see is a similar device for converting analog recordings, in whatever form, be it tape, vinyl, or wax cylinders, to an open digitized format, and then have those recordings made available in like fashion. It might be just as interesting to turn those kids in Africa on to Mozart, or oral arguments from the Supreme Court.
The best way to do is to be.
After a long night of coding, or sleeping for that matter, it is hard to focus on the text on the screen. Scrolling down is another matter; I end up putting text up to 200% zoom in Mozilla. So now we can all print out these digitized copies and read them. This is neat stuff, sure, but reading from a screen is hard, and most people will print it out anyways. The good thing is that people can now download it from the net, assuming it is hosted on a site.
OMG OMG OMG WTF OMG WTF BBQ STFU RTFM, OMFG OMG OMG OMG ROFL LMAO OMG WTF STFU ROFLMAO
Finally, Johnny-5 is coming alive!
Music wants to be free.
This story is a good opportunity to plug some free software you could use to help digitize books.
Stuart Inglis's tic98 is a lossless compressor designed for black-and-white scanned documents. It achieves better compression ratios than anything else, or at least it did a couple of years ago. If you have scanned documents to make available online, it's fairly simple to write a CGI script to convert tic98 on the fly to PDF.
Hopefully someone else will reply to this comment with a recommendation of good free OCR software.
-- Ed Avis ed@membled.com
Those people in #bookz on IRC are gonna be so excited about this...
What do the newspapers, and more likely the magazines, think of this?
Now the magazine rack at 7-11 will show up on Kazoom and all that.
I mean, comic books or "graphic novels" as the nerds call 'em already get traded freely, but that's because some joker with no life takes a day out of his life to scan and crop each page.
But if you could just take the magazines, stick 'em in this robot, then share 'em, it could hurt the publishing industry the way it's hurt the recording industry.
And everyone will justify it by saying "why should I buy a magazine when it only has one good article and the rest is crap!"
So what measures can we expect to see? Lighter inks, crazier fonts to screw with the robot's OCR? Funny paper that makes it hard to flip pages?
I don't need no instructions to know how to rock!!!!
But does this passage puzzle you a bit?
"Think about the power of bringing our library to little schools in the middle of Africa," Keller said. "Would it make a difference for those who now have their minds closed to the idea of democracy?"
I'm not sure I get the connection:
Mbutu: Hey, Kwasa, check out this copy of "The Horse Whisperer" on my Palm Pilot.
Kwasa: Incredible! We must hold free elections immediately!
If your bitterest enemies are people who hack the heads off civilians, then I would say you're doing something right.
What do we need to do to get one of these donated to Project Gutenberg? Right now one of the biggest things holding them up is a lack of volunteers to manually scan the books.
Mechanik
All it takes is one *really* large project. If somebody like the Library of Congress started scanning/digitizing their collection (I know--subject/verb agreement :), it would obviate the need for just about any smaller libraries to do so. You don't need thousands of libraries to scan the same book, you only need one, and then you can replicate electronically. Surely there are specialty libraries around that have unique collections, but again--all you need is one...
I didn't RTFA, but this could be useful not only for developing countries, but as a "force-multiplier" of sorts for smaller community libraries. En masse digitizing of published works would allow smaller libraries to compete on a more even footing with larger ones, without having to invest loads of money into their collections and facilities to hold them.
Any well-heeled library patrons out there want to donate some money earmarked for one of these things to the large library of your choice?
This would be awesome for records/document archiving. I knew a guy who worked at our State Library who had to catalog courthouse records across the state. He'd go out to some remote county where all the marriage, land and court records were on paper and try to figure out what they had. Some of the records went back to before the American Revolution. In nearly all cases, the only records were on paper.
If he could drag this robot along to a courthouse and scan the records over a couple of weeks, it would allow him to digitize that information quickly. Not only would the digital copies be easier to search, they would be easier to preserve. One courthouse, whose file room was in the basement, nearly lost all of its old records to a flood.
I'm glad they didn't go with the design where it licked its thumb before turning each page. I hate that!
"Tolerance is the virtue of the man without convictions." -- G. K. Chesterton
Time for a change in terminology.
I'm afraid a "puff of compressed air" ain't gonna unstick those pages.
M@
Krispy Cream is people
For the love of GOD, someone check this!!
blakespot
-- Heisenberg may have slept here.
iPod Hacks.com
The article says it would become cost effective at 5.5 million pages. Later it says it costs between $1 and $4 per book in the Far East. So if you estimate a book to have around 300 pages, doing the digitising manually would cost $18,333-$73,333 per 5.5 million pages (i.e. 5,500,000/300 multiplied by the cost per book). From the way the article is written I expected it to cost A LOT more. I guess the proofreading cost for manual conversion could be high?
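For anyone who wants to check or vary that estimate, here is a minimal sketch of the same arithmetic in Python. The 300 pages per book figure is the guess from the comment above, not a number from the article, and the function name `manual_cost_range` is made up for illustration. The robot-hours line uses the "more than 1000 pages/hour" speed from the story summary as a floor, so the real run time would be at most that many hours of continuous scanning.

```python
def manual_cost_range(total_pages, pages_per_book, cost_low, cost_high):
    """Return (low, high) cost of manual digitizing, priced per book.

    pages_per_book is an assumed average, so the result is only a rough estimate.
    """
    books = total_pages / pages_per_book
    return books * cost_low, books * cost_high


TOTAL_PAGES = 5_500_000   # break-even volume quoted from the article
PAGES_PER_BOOK = 300      # assumption made in the comment, not in the article

low, high = manual_cost_range(TOTAL_PAGES, PAGES_PER_BOOK, 1.0, 4.0)
print(f"manual cost: ${low:,.0f} to ${high:,.0f}")

# Upper bound on robot run time, taking 1000 pages/hour as the minimum speed.
robot_hours = TOTAL_PAGES / 1000
print(f"robot time: at most {robot_hours:,.0f} hours")
```

Change `PAGES_PER_BOOK` or the per-book prices to see how sensitive the comparison is to those guesses.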
Not too long ago I had to do a research paper for a college class. No big deal, I've done many of them, and I was not looking forward to this one. Well, I went to the Houston Public Library in Downtown (which I hadn't been to in many, many, many, you get the idea, years). I got the library card that gave me access to some computer terminals and the computer card catalogue. I was amazed at what they had converted to electronic form and at the links to other sites that had dictated material. I was also amazed that I could get all this same access from home using the information printed on the library card. So I went home (I have a Road Runner cable modem) and did my research there instead of being trapped in the library. I found electronic versions of lots and lots of textbooks, magazines, government docs, and much more. What took me a notch or two down from my high horse was that they even had radio talk shows transcribed (which I used in my research paper), and that helped a lot!
There is a lot of information ALREADY converted from text and audio sources at your fingertips that was unfathomable a few years ago. And all of this is free from the website (and links to other sources) from the public library. Talk about your one stop shop.
Using air to separate and move paper is not new. Heidelberg platen presses (you may remember them from high school graphic arts classes) have had this feature for about fifty years.
The more traditional way to preserve the contents of the old books is to destroy them in the process. Actually cutting the page out of the book lets you get a much higher quality scan because the page is then really truly flat. (Yes, there are correction techniques for turning scans of non-flat pages into flat "projections" but they aren't nearly as good as just ripping the page out and scanning it.)