How Google Book Search Got Lost (backchannel.com)

Posted by msmash on Wednesday April 12, 2017 @03:20AM from the endangered-specimen dept.

Google Books was the company's first moonshot. But 15 years later, the project is stuck in low-Earth orbit, argues an article on Backchannel. From the article: When Google Books started almost 15 years ago, it also seemed impossibly ambitious: An upstart tech company that had just tamed and organized the vast informational jungle of the web would now extend the reach of its search box into the offline world. By scanning millions of printed books from the libraries with which it partnered, it would import the entire body of pre-internet writing into its database. [...] Two things happened to Google Books on the way from moonshot vision to mundane reality. Soon after launch, it quickly fell from the idealistic ether into a legal bog, as authors fought Google's right to index copyrighted works and publishers maneuvered to protect their industry from being Napsterized. A decade-long legal battle followed -- one that finally ended last year, when the US Supreme Court turned down an appeal by the Authors Guild and definitively lifted the legal cloud that had so long hovered over Google's book-related ambitions. But in that time, another change had come over Google Books, one that's not all that unusual for institutions and people who get caught up in decade-long legal battles: It lost its drive and ambition. Google stopped updating its Books blog in 2012 and folded it into the main Google Search blog. The author reports that Google still has people working on Book Search, and they are still adding new books, but at a much slower pace.

Sigh. by ledow · 2017-04-12 03:29 · Score: 4, Insightful

Are they a shareholder-answerable business?
Does it make them money?
No? What did you expect?
This isn't surprising. It never took off like some other things, it therefore turns into an expense with little return (Do they charge a percentage of book sales found through their searches? Can they enforce that and stop you just taking the ISBN and buying from Amazon once you've found it?), so it will die when people lose personal interest in it.
The only things I can see staying any significant length of time are Google search and Google Apps. Everything else is just a boredom / filler project that can disappear like so many others, Google or not.
1. Re:Sigh. by Anonymous Coward · 2017-04-12 04:35 · Score: 4, Interesting
  
  >Are they a shareholder-answerable business?
  Yes, and they should cut executive pay. There are thousands of highly talented individuals that would do excellently at senior level management and board level positions for little to no pay. Am I using the 'asshole businessperson' logic correctly?
  >Does it make them money?
  You're posting on Slashdot, which *used* to be a haven for IT professionals, and the stereotype of 'IT only costs money, it generates no revenue' is still alive and well. But sure, you do you.
  >No? What did you expect?
  Exactly fucking this. They're not called 'moonshots' for nothing. Maybe you're just an ignorant asshole that knows nothing of the first Apollo missions, but go look those up.
  >This isn't surprising. It never took off like some other things, it therefore turns into an expense with little return (Do they charge a percentage of book sales found through their searches? Can they enforce that and stop you just taking the ISBN and buying from Amazon once you've found it?), so it will die when people lose personal interest in it.
  You need a better prescription because your shortsightedness is pretty bad. Let's see, Google could make a scientific publishing platform, combine that with search to prioritize and encourage people to check out their paid article/news service, which could have fact checking built in, and they could work out deals with the Associated Press to open up an online only news publishing wing. Google knows that there are certain people that *will* pay for things, and so they could use their vast inventory of information from books to complement/fill out services.
  Hell, they could take interesting random snippets from books that Google thinks are relevant to search terms, put some performance metrics on it, such as time spent reading quotes/text sections, or give people automated reports/answers to relatively complex questions, like 'what are the differences between x and y', and Google could compile a report with answers on it.
  >The only things I can see staying any significant length of time are Google search and Google Apps. Everything else is just a boredom / filler project that can disappear like so many others, Google or not.
  So that whole 'Android' thing didn't work out, huh? And mobile search just sucks donkey balls, right?
  And those self driving cars are just shit, right? And none of those patents will ever be useful, amirite?
2. Re:Sigh. by Anonymous Coward · 2017-04-12 06:21 · Score: 2, Interesting
  
  As an academic researcher, I can say that a future without Google Books and Google Scholar would reduce scientific output by well over 50%.
  Everyone uses both to find citations and data in minutes. Without them, the same task would take hours and require booking time with a librarian at the university library, and it has to be repeated dozens of times for each paper.
  And that is assuming the university library has a copy on hand.
No surprise here, move along... by __aaclcg7560 · 2017-04-12 03:32 · Score: 5, Interesting

When I worked at the Google IT help desk in 2008, the building next door had all the book scanners. It was supposedly a miserable place to work at, low pay for flipping book pages, a relentless daily quota and a high turnover rate. Makes help desk support look like paradise.
Re:google books by __aaclcg7560 · 2017-04-12 03:38 · Score: 3, Insightful

"I haven't failed. I've just found 10,000 ways that won't work." - Thomas Edison, on the electric light bulb.
Re:They didn't automate page flipping? by __aaclcg7560 · 2017-04-12 04:11 · Score: 2

2) Automate page flipping for books that couldn't be spine-cut or sheet fed.
My understanding is that the early book scanners had a chair that the operator sat back in to look at an overhead monitor. One button took a picture of the page, the other button flipped the page. If the book went out of alignment, the operator had to readjust it. The technology may have changed since then, as the human component was a big problem for the program back then.
http://hackaday.com/2012/11/16/google-books-team-open-sources-their-book-scanner/
Re:They didn't automate page flipping? by HornWumpus · 2017-04-12 04:19 · Score: 3, Interesting

I have a friend who is weird even by my social group's standards.
One of his 'interests' is preserving old DEC documentation. They just use a binding guillotine and a high speed sheet feeder scanner. Along with countless tricks to restore tape for one last read pass etc.

--
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Re:They didn't automate page flipping? by HornWumpus · 2017-04-12 04:45 · Score: 2

90% of books that need scanning should be cut up.
Just not the books that Google borrowed from a library. Librarians are the people who can tell the difference, but I'm sure Google could come up with something to do 99% of the sorting (mostly, already scanned...)
What they really need are portable scanning solutions. Librarians are just the kind of people that would love to help, so long as their books don't go too far out of their control. Even absent that, most libraries produce a steady stream of 'discards' that should be checked against the 'books database' first.
Anybody should be able to take a picture of a title page and have Google tell them if they want the book for scanning. 'Book people' would do it.

Re:They didn't automate page flipping? by GuB-42 · 2017-04-12 05:03 · Score: 3, Informative

Google had to do it in the least damaging way possible. It was a necessary condition if they wanted libraries to cooperate.
Non-library books were processed destructively, by cutting off the spine.
Re:GB vs Project Gutenberg by tepples · 2017-04-12 05:21 · Score: 2

Project Gutenberg specializes in notable books that are more than three generations old.
Re:GB vs Project Gutenberg by Anonymous Coward · 2017-04-12 05:39 · Score: 2, Informative

No. Gutenberg makes text versions of fairly common books. You might think that they're uncommon, but as an academic who specializes in European books of the 15th-17th centuries, I can tell you that Google has found things that are absolutely miraculous. I've seen Google scans of books that exist in only four copies in libraries across Europe. I've seen whole sub-genres of literature that were thought lost suddenly appear on the internet. If you work in early modern literature, especially older forms of German and French or newer forms of Latin, Google Books and its associated HathiTrust project are a revolution, and the Gutenberg Project isn't even a blip on the radar.
Re:GB vs Project Gutenberg by Solandri · 2017-04-12 06:16 · Score: 2

Project Gutenberg scans books which are out of copyright, and only famous ones.

Google Books scans contemporary works. That in itself made it worth doing. Basically if the Library of Congress burned down, there would be millions if not billions of contemporary books and magazines which existed only on the authors' computers, and in printed form on collectors' shelves. There would be no central database of these works, much less a searchable one. Regardless of what you think of Google Books or how boring it is to work there (I'm having similar boredom problems scanning dozens of my family's photo albums), it's a project well worth doing.
Time to free the data by hackel · 2017-04-12 06:42 · Score: 3, Insightful

Google Books always seemed like a great idea, but the idea of the search giant owning all of the data always made me incredibly uncomfortable. This data should be in the public domain. Authors should feel *privileged* to submit their works for inclusion in the database, not fight it. It seems that, at least recently, Google Books has served primarily as a means to drive book *sales*. That's not an admirable goal. It's time for Google Books to be converted to a community-driven effort, like Wikipedia. Release all the data under a Free database license that ensures the data cannot be used commercially, and allow the community to help with the effort. This would be an incredible achievement for humanity in general. Oh well, one can dream...