Google Engineers Open Source Book Scanner Design
c0lo writes "Engineers from Google's Books team have released the design plans for a comparatively reasonably priced (about $1500) book scanner on Google Code. Built using a scanner, a vacuum cleaner and various other components, the Linear Book Scanner was developed by engineers during the '20 percent time' that Google allocates for personal projects. The license is highly permissive, thus it's possible the design and building costs can be improved. Any takers?"
Adds reader leighklotz: "The Google Tech Talk Video starts with Jeff Breidenbach of the Google Books team, and moves on to Dany Qumsiyeh showing how simple his design is to build. Could it be that the Google Books team has had enough of destroying the library in order to save it? Or maybe they just want to up-stage the Internet Archive's Scanning Robot. Disclaimer: I worked with Jeff when we were at Xerox (where he did this awesome hack), but this is more awesome because it saves books."
FTFA: For the past eight years, Google has been working on digitizing the world's 130 million or so unique books.
If these books are truly unique, you're taking a big risk subjecting them to this contraption.
Set your phasers on "funky"!
I work in the tech recycling business, but we get literally hundreds of tons of books turned in for recycling. It pains me to see most of them go to paper recycling recovery, though there is a growing market for shops that scan barcodes for resale. I would think that Google would have problems with copyright law, as would any single entity at risk of scanning the wrong book (i.e., the one someone would take the time to sue you over, especially if you have deep Google-pockets). This direction opens the door to small-scale "wiki-scanning", which could be really ideal: people who have actually read the book would probably be the best ones to figure out if it was worth the time to scan, would tend to prioritize important books (preserving them), and would present a very decentralized target for lawsuits. If I can scan the book for "personal use", like the cassette tape rulings for music, all the better. The problem is the physical space these books take up. It's causing a lot of out-of-print books to get made into cereal boxboard, and the scale at which 50-100 year old out-of-print books are getting recycled is scary.
Gently reply
We know it can happen. Rome fell, Greece fell, Angkor Wat fell, Easter Island collapsed. Societies die just like we do.
It would be a shame to lose all of the knowledge, art, and literature that we have accumulated during our tenure so far.
Scanning books is a good way to archive much of that information for the next society that can develop digital computing. I suggest we enshrine it all in orbit or on the moon, guaranteeing it relative immortality and making it accessible only to those technologically advanced enough to benefit from it.
For all we know, the ancient Khmer civilization at Angkor Wat invented advanced technology that has simply been lost to time.
We owe it to future generations to make sure our society does not lose as much when it collapses.
...I think it's fundamentally flawed in that it would not take much to have a misaligned page sliced right out of the book. Certainly nothing I'd risk a book of any value over. Sorry, this one appears to be a non-starter (although it is rather novel, pun intended).
I am guessing that this is the Google TechTalk video that is discussed in the summary, but not linked (or more likely edited out): http://www.youtube.com/watch?v=4JuoOaL11bw
But stone & clay slabs of the Sumerians and papyrus of the Egyptians have survived until today, while the original data feed of the Apollo missions is lost forever because the old data tapes were trashed when no one had the equipment to read them.
bickerdyke
Looking at that Slashdot reference from 2003, it's fairly obvious how much comment quality has declined since the Good Old Days (TM) (and /. still doesn't accept UTF-8!).
The summary questions Google's motivations for doing this, but I think it should be clear this isn't a Google project, really. 20% projects can't be totally random, personal things that have no relationship whatsoever with the business or possible business... but the link can be very tenuous, and the cooler the project is, the weaker it can be. All tech managers at Google are engineers themselves and tend to be just as able to geek out about cool stuff as the people they supervise.
Various other bits of obvious Google support for the project are also more incidental than planned. For example, Dany mentions that he built the machine in one of the on-campus workshops. Those workshops are there for "real" work, but they're also available for any employees to use on an as-available basis. Tech talks are also organized by and for the employees for their own interests, with basically zero "corporate" supervision. Most are actually job-related, but far from all. There are plenty of project talks and hobby talks (though this particular hobby/project talk is much cooler than most).
I imagine there was a cursory review required to get permission to publish the talk and the design, but such things tend to be handled on a "is there some really good reason we should say no?" basis. If not... go for it. Publishing cool, geeky things done by Google engineers is pretty positive for Google's brand, and it makes the engineers happy, which is good for employee retention -- especially since the kind of employees who do cool stuff for fun is the kind Google most wants to retain.
Bottom line: It's very unlikely anyone at Google has a corporate strategy built around the release of this information. It's just an engineer doing something he thinks is fun and valuable (to someone) and the company providing generic support for such activities, and otherwise staying out of the way.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Anyone for building the same thing out of Lego?
I'm not sure you're making a valid comparison. If I choose any particular piece of Egyptian recorded information, there's a good chance that it has been destroyed. The fact that some material survived several millennia is both impressive and interesting, but a great deal of material survives from the 60s even if some has been lost.
I mean how many records of the ancient Egyptian space race survive to this day? I rest my case.
I seem to have missed something. The image detector only seemed to be on one side of the "V", so how does the page on the other side get scanned?
With all due respect, if 90% of the written material in existence today ceased to exist... the societies of the future might be better off.
Like this?
/me thinks you weren't really thinking of the book scanner design when you made that comment ;-).
Note that there's absolutely no relation between Book Scanners and Phone Design.
See archive.org...
This wasn't meant as a comparison of better and worse. Just as a set of risks specific to digital archives.
Go and try to read your letters from a 5.25'' floppy disc with your VizaWrite files from just a few years ago. Wouldn't have happened with paper printouts.
On the other hand, go to a movie archive and see the first cellulose movies lost due to simply rotting away... wouldn't have happened with DVDs
Then again, if there's no DVD player left....
A form of archiving that needs special knowledge (file formats) or devices (media) adds an additional long-term risk. But of course it also greatly reduces other risks. (When the Amalia-library in Weimar burned down a few years ago, lots of invaluable books were destroyed. Google Books could simply restore an offsite backup.) It's simply a tradeoff, as always.
And this is in no way a new problem. (Pulling this from the back of my mind, corrections welcome)
The Sumerians used soft clay slabs for rather unimportant, temporary stuff as it could be erased easily. "cultural treasures" like the Gilgamesh epic were stored on valuable parchment. Now guess what survived a series of fires..... Hint: we have tons of shopping lists to work with....
bickerdyke
This is the survivor bias that leads to conclusions like "they don't make them like they used to," not realizing that the fragile or poorly-constructed crap has largely been destroyed without a trace.
Gamingmuseum.com: Give your 3D accelerator a rest.
I'm waiting for a reference to the shredder-scanner to come up from Rainbow's End.
http://en.wikipedia.org/wiki/Rainbows_End (although the wiki article doesn't mention that piece of the plot, sadly)
They only have to add some feature that stores your scans in the cloud.
Google has a patent on using structured lighting to determine the shape of the page and correct the image ... is that open too?
Your paper print outs are likely on shitty, acidic paper. So are my 40-50 year old sci-fi novels and the pages are yellowed and frequently crack if I dare to read them. I dare say your paper print outs have, at best, double the life span of those floppies. The nice thing about digitized data is that it doesn't have to stay in one place, with incremental syncing it can live in a million places, accessible by a wide range of devices. The only thing we need to do is write the "manual", probably on some stone tablets.
We owe it to future generations not to let our society collapse? Why? Besides the obvious, energy. The dates are off, but it's close enough. Around 1900, we could get 100 barrels of oil for every 1 barrel invested. By about 1970, that had dropped to 30; by 1980, 17. In the present-day world, for new fields (the old Saudi fields still have a high EROEI), it's about 3. Everything we've accomplished has been because of a substance we're determined to suck every last drop of out of the ground so (mostly) Americans can drive their 1.5+ ton cars from stop light to stop light. The same goes for coal too.
If society collapses, we lose a lot more than just books and knowledge: we'll lose the energy investment we've spent so far to get us where we're at, and have little to no low-hanging energy fruit to get things jump-started. Is this a worst-case scenario? Mostly, but we'll have a hard time getting back to anything we consider modern without replacements, which are easier to transition to when you can use oil and coal to help you than when you have to start over and re-invent many wheels.
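To put rough numbers on the EROEI argument above, here is a minimal sketch. It assumes the usual reading of EROEI as energy returned divided by energy invested, so the share of gross output left over after paying back the investment is 1 - 1/EROEI. The ratios are the approximate ones quoted in the comment, not measured data, and the names `EROEI_BY_ERA` and `net_energy_fraction` are made up for illustration.

```python
# EROEI ratios as quoted in the comment above (approximate by the
# commenter's own admission; not authoritative figures).
EROEI_BY_ERA = {
    "c. 1900": 100,
    "c. 1970": 30,
    "c. 1980": 17,
    "new fields today": 3,
}


def net_energy_fraction(eroei: float) -> float:
    """Share of gross energy output left after repaying the energy invested.

    Assumes EROEI = energy returned / energy invested, so the surplus
    share of gross output is 1 - 1/EROEI.
    """
    if eroei <= 0:
        raise ValueError("EROEI must be positive")
    return 1.0 - 1.0 / eroei


if __name__ == "__main__":
    for era, ratio in EROEI_BY_ERA.items():
        share = net_energy_fraction(ratio)
        print(f"{era}: EROEI {ratio}:1 leaves {share:.1%} of gross output as surplus")
```

Under that assumption the surplus share goes from about 99% at 100:1 to about 67% at 3:1, which is one way to see why the drop from 17 to 3 matters far more than the drop from 100 to 30.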
Greece fell,
Oh, come on, Greece is still working on securing more loans, it hasn't fallen yet!
Yes, and you CAN print it out. And you CAN print it on good paper...
but what about the inks that you are using? I don't think those will survive very long. And getting better inks that will work with an existing printer is a real problem.
FWIW, I don't really have a much better answer than an improved clay tablet. And preserving anything that way is so expensive that it won't be done...except on a trivial scale. The original CDs were durable things, but that doesn't apply to the ones that you can burn at home. They use phase transition metals, which over time will relax back into the low energy configuration. Pits burned in metal foil and sandwiched between glass are much more durable. But both of those take specific technology to read. And that's pretty much guaranteed not to survive. Black and white (silver process) prints onto glass can be pretty durable, can be written as microfiche (the transfer to glass would occur as a printing process), and sandwich a sheet of glass over the image, so it won't be abraded. That would be pretty durable, fairly dense, and could be read with a decent magnifying glass. But it's not going to be done (again, except on a trivial scale). The equipment to produce the images would be very expensive, and you couldn't sell the results.
I think we've pushed this "anyone can grow up to be president" thing too far.
It's not that simple. (Nothing ever is.) Preserving information for the future runs into a lot of issues.
It's not enough to just store a copy on your hard drive. That only takes care of a few of the above cases related to the physical media. Previous-generation hardware is one problem, but you also have to have previous-generation applications. In 2050, who is going to have a copy of Adobe Reader that can read the old virus-laden v4.0 formatted PDFs? If you want to read that old Word 1.0 file, you would have to have preserved a copy of VirtualBox that works on whatever hardware and OS exists in the future, plus a working installation of Windows 3.1, plus a working installation of Word for Windows 3.1. When you migrate your data, are you going to ensure that you migrate a tested museum environment along with it?
If you're not going to preserve all the needed ancient environment, you have two choices. You can either migrate the data, or you can dispose of it. It's a lot of work. Every generation of technology will require you to make that choice for each piece of your old data. Let's say that it's migration day today, and you decide that you can just copy the Word 2003 file without migrating it, because Word 2010 can read it. Do you know for sure that the next time you need to migrate that file that you will still have a program that can read Word 2003 files then?
And in 2050, will anyone still care?
John
https://www.google.com/search?q=helicopter+of+abydos&tbm=isch&source=univ&sa=X
Ancient Egyptian spacecraft & helicopters!
... If society collapses, we lose a lot more than just books and knowledge: we'll lose the energy investment we've spent so far to get us where we're at, and have little to no low-hanging energy fruit to get things jump-started. Is this a worst-case scenario? Mostly, but we'll have a hard time getting back to anything we consider modern without replacements, which are easier to transition to when you can use oil and coal to help you than when you have to start over and re-invent many wheels.
This -- please mod up (no points left)
Adds reader leighklotz: "... Disclaimer: I worked with Jeff when we were at Xerox (where he did this awesome hack), but this is more awesome because it saves books."
That's not a *DISCLAIMER*, dammit! That's a *DISCLOSURE*!
it's an open design, dumbass. save yourself some time and don't build the spying part.
films that old don't necessarily rot. they either get eaten by fungus or burn on their own once exposed to ambient air. Nitrates were not an ideal material for making precious archival materials from...
optical discs are actually made in a near identical process to microfiche.
we could simply etch much much smaller using lasers on current replication hardware. you could probably write a small program that translates text files into an ISO file you could burn yourself that results in a human-readable disc.
hell, i want to try that. that sounds amazing.
IIUC, current consumer CDs and DVDs write using a phase transition process that changes the reflectivity of the metallic layer written upon. Over time this relaxes back into the low energy configuration. It may be good for a decade or two, but I doubt that it's even good over a century.