Google To Resume Scanning Books

Posted by Zonk on Tuesday November 1, 2005 @08:09AM from the love-of-the-digital-printed-word dept.

SenseOfHumor writes "The Wall Street Journal is reporting that Google will resume scanning copyrighted books from Stanford and Univ of Michigan libraries. Let the battle resume!" From the article: "It isn't known just what percentage of library holdings fall into the category of being in copyright but out of print. About 18% of the books held by the libraries working with Google were printed prior to 1923 and are therefore in the public domain, according to an analysis by the Online Computer Library Center, a Dublin, Ohio, nonprofit library cooperative. An unknown percentage of the rest still are protected by copyright, depending on whether it was renewed. Google's resumption of its scanning of copyrighted works comes amid heated debate in the library community over participation in the program."

22 of 257 comments

I can not wait by w.p.richardson · 2005-11-01 08:13 · Score: 4, Insightful

until this is available! I can't count the number of times that I have been flipping thru a book and wished I could use the search function, only to realize that it doesn't work in meatspace. Searching is up to me, using old fashioned tools, like the table of contents or the index!

--
Curb CO2 emissions: Kill yourself today!
Good. by Gothic_Walrus · 2005-11-01 08:14 · Score: 4, Insightful

Books are fragile. They need to be preserved somehow.
Would Google be allowed to store scanned copies of books even if the authors opt out? Someday, those print copies are going to be destroyed or deteriorate to the point of uselessness, which means that Google could be archiving works that might otherwise be lost forever.
I still don't get the uproar over the scanning, because it's not like the entire book is made available for free. The search is so crippled that it makes me think the people who are upset have never used it before.

--
Goo goo g'joob.
1. Re:Good. by daniil · 2005-11-01 08:27 · Score: 2, Insightful
  
  Books are fragile. They need to be preserved somehow /.../ which means that Google could be archiving works that might otherwise be lost forever.
  Nonsense. Books are far less fragile than any of this digital crap. Drop a book, and nothing bad happens to it (unless you drop it into water). Drop a hard drive, and it's dead. All in all, it is more likely that digitally stored information will be lost forever than a book.
  
  --
  Man is a slave because freedom is difficult, whereas slavery is easy.
2. Re:Good. by MindStalker · 2005-11-01 08:45 · Score: 2, Insightful
  
  Yes, and no. Digital information, while easily lost, can be copied indefinitely. Joe Schmo making a scanned copy of a book is more likely to lose his data before that book is destroyed. But we are talking about a consistently archived, RAIDed-to-the-max system of storage. Google's storage of this huge amount of material, from website caches to the search index, is Google's product. There are literally billions of dollars invested in Google's data. If they weren't being disgustingly obnoxious in their backup system, their investors would have a cow.
Hmm. by daniil · 2005-11-01 08:17 · Score: 5, Insightful

FTA: "I feel that this is a potential disaster on several levels," said Michael Gorman, president of the American Library Association and university librarian at California State University, Fresno. "They are reducing scholarly texts to paragraphs. The point of a scholarly text is they are written to be read sequentially from beginning to end, making an argument and engaging you in dialogue."
The sad thing is, scholarly texts are so abundant nowadays that it's nigh impossible to keep oneself current with all the new things published. Already there are magazines that only (or mostly) contain abstracts or reviews of new dissertations and articles. I fail to see how Google Print is a greater disaster than this. If anything, it'll only improve the situation.

--
Man is a slave because freedom is difficult, whereas slavery is easy.
1. Re:Hmm. by joelpt · 2005-11-01 09:17 · Score: 2, Insightful
  
  Well, this librarian just doesn't get it. He thinks that once the texts are searchable, readers will read just the 1-2 sentence snippets and that will be the end of it.
  
  In reality, readers will actually just locate the texts pertaining to their search, and then read those texts in their entirety outside of the search results page -- inasmuch as they choose to. Presumably, scholarly readers would choose to read the entire texts more often than others.
  
  Following his reasoning, we would all have incomplete understandings of many paperback novels which we've flipped through in a bookstore. Which may in fact be true, if we didn't read the whole book -- but we read all that we *wanted* to. And we will continue to do so with the emergence of Google Print.
Did I miss something? by Descalzo · 2005-11-01 08:17 · Score: 5, Insightful

Assuming Google won't make these scanned copies available to everyone to just read on their Palm Pilot, what is the big deal? I would think that it would be to the copyright owner's benefit to have their text in a database to be searched. If Google has my book in its database, and someone searches for text that happens to be included in my book, doesn't that make it more likely someone will buy it?
Seriously, though, I feel like I'm missing something here. What is it?

--
I cried real tears when Li Mu Bai died.
Book Tablets by GoodOmens · 2005-11-01 08:20 · Score: 4, Insightful

This will make those book tablets that never really sold a neat idea.
1. Connect to Google.
2. Download a rare book only found in a handful of libraries.
3. Go read it....
1923 - 1990: the gap years by G4from128k · 2005-11-01 08:23 · Score: 3, Insightful

The move to digitize out-of-copyright works, combined with the natural movement of new materials to the web, creates a kind of internet information gap. Many things appearing after 1990 or so are now on the net in some form or other. And things that occurred before 1923 are in material now lapsed from copyright and will be on the net courtesy of Google. Things in between are too old to be on the net directly and too new to be out of copyright.
In 75 years, give or take, the gap will close from the oldest years on. But for a while the internet will not have as much content on a wide array of pre-digital topics.

--
Two wrongs don't make a right, but three lefts do.
I love the fact by gregbains · 2005-11-01 08:26 · Score: 4, Insightful

It's great that a lot of the people opting out think this will save them, when all you need to do is visit a local library and get the entire book to read for free. Surely this is worse?
spread information by dingDaShan · 2005-11-01 08:35 · Score: 5, Insightful

I currently attend the University of Michigan, and have encountered much frustration in tracking down books. The university has several libraries and finding some books is nearly impossible. Additionally, the University has old collections and manuscripts that are barely indexed in the University's system. For the purposes of research, scanning the books is a dream come true. It brings keyword searching, the ability to quickly find books, and the ability to view old manuscripts that one would normally need to be present at the library (and under supervision) to view. The copyright issue is important, but the books that are in public domain (primary sources especially) should definitely be scanned. As for the copyrighted books themselves, Google does not allow the full book to be viewed. If anything, Google advertises for these books. For a student such as myself, I would not buy the book as it is, so what is the harm?
Re:Out of print - fair game by Kelson · 2005-11-01 08:36 · Score: 2, Insightful

Yeah, that's one of those cases where the legality is clear, but the ethics are debatable. You're certainly not damaging their ability to "leverage their assets" since they aren't doing anything to leverage them, and if they'd just sell you a copy, you'd be quite happy to buy it. (Of course, this argument would fall apart laughably with physical goods.)

The saddest stories are the ones about silent-era films left practically rotting in the studio vaults. If the studio doesn't think they'll get any money for it, they have no reason to maintain/restore the old film, but film historians, who have every reason to want the film preserved, can't get at it.

On the other hand, you occasionally hear singers, authors, etc. talking about that early album or novel that they wish hadn't been published, and joking that they want to buy up every surviving copy and burn it.

And then there's the question of material that has never been published in the first place...
Digital information is just too volatile by October_30th · 2005-11-01 08:37 · Score: 2, Insightful

Huh?
Have you ever thought about how much more effort it takes to destroy a book in comparison to the effort it takes to destroy its digital copy?
It's the same thing with all digital data: in a few centuries this era will be called the dark ages of information - most of the historical data (text, images, sound) will be lost because it was stored on media that just couldn't hack it. People are just too eager to store precious data in a digital form just because it is convenient.

--
The owls are not what they seem
Re:Out of print - fair game by Scrameustache · 2005-11-01 08:37 · Score: 2, Insightful

I do object to commercial enterprises refusing to sell me something and insisting that I can't copy it.
It might be their intellectual property but it's my culture, dammit.

Ah, but the owners of the Disney Vault disagree... and it so happens that they also own some key lawmakers.
They like creating artificial scarcity in order to raise the asking price. It pleases them. The law of supply and demand is a lot of fun when you control the supply.

--
You can't take the sky from me...
Google[black]mail by matt+me · 2005-11-01 08:42 · Score: 2, Insightful

Though I personally believe what Google are doing is not ethically/morally wrong, they are most probably 'breaking' our unjust (injust?) copyright laws. The only reason they are 'getting away' with it is because they are the most powerful domain on the net. No-one dares mess with Google.

A lawsuit against Google is very bad publicity, and they could subtly drop your page rank and you'd never notice until the visitors stopped coming... or even remove you completely.
Flaunt/Flout--From TFA by adavies42 · 2005-11-01 09:00 · Score: 3, Insightful

Mr. Gorman, who said the American Library Association doesn't have an official position on the subject, described Google's argument that Web users will be able to look at several snippets and then decide whether they want to buy or read the book as "ridiculous." Further, he noted that as a published author, he opposes Google's intention to build an enormous database that includes copyrighted texts. "It's a flaunting of my intellectual property rights," he said. [emphasis added]
If the president of the American Library Association doesn't know the difference between flaunt and flout, I think civilization is doomed.

--
Media that can be recorded and distributed can be recorded and distributed.
-kfg
Re:Lessig's Tough Call by judmarc · 2005-11-01 09:05 · Score: 3, Insightful

Why would Google care? You can make "Jamie Web Search" right now and Google has no right to stop you. Go ahead, index the Web. MS and Yahoo have, and Google isn't suing them. It isn't the *data* that's the secret sauce, it's the *search algorithm(s)*. The very same is true of Google Print.
Re:Out of print - fair game by natophonic · 2005-11-01 09:12 · Score: 2, Insightful

I'm not usually one to support the knee-jerk slashweenie response, that all intellectual property is theft, but I do object to commercial enterprises refusing to sell me something and insisting that I can't copy it.

It might be their intellectual property but it's my culture, dammit.

I think I've figured out how to win the intellectual property wars... recast the battle as a culture war and you've got the support of knee-jerk conservative bush-apologists everywhere! ;)
The Kelly v. ArribaSoft argument by Anonymous Coward · 2005-11-01 09:25 · Score: 3, Insightful

One of Google's arguments is that a marvelous Ninth Circuit decision of a few years ago, one of the few to deal with Internet issues, gives them the right to do what they're doing. But Kelly v. ArribaSoft is a weak reed to rest Google Print's case on. True, there are parallels. The thumbnails of art and photos that Arriba was indexing and posting as the case went to trial are like the short excerpts that Google Print will use, and the helpful Arriba quickly complied with opt-out requests, including those by the ill-tempered Kelly, who sued.
Both the district and appeals courts stressed the service that Arriba was providing to everyone by linking to web sites where an artist/photographer had made his art available online. The thumbnail itself, the court noted, was of such poor quality as to be of no value, something that isn't always true of a quote. (And one of the worst ways to treat an author is to quote them out-of-context, a practice Google's scheme will encourage.) The artist/photographer had also chosen to post his work online, thus putting it on the market. The Arriba link took interested parties to a site where they could pay the artist/photographer for the rights to a usable image. Arriba was creating a win/win situation for everyone and, once the image was thumbnailed, the full version no longer existed at Arriba's web site. All those factors taken together were sufficient to make what ArribaSoft was doing legal.
But the Arriba/Google parallel only exists for books that are in print, being marketed online, and paying royalties to the copyright holder. Out-of-print books that are only available used or through libraries do not parallel the ArribaSoft case. What Google is doing is much more like sending someone with a digital camera to art galleries and museums, ignoring any wishes of the artist. Owning a copy isn't owning the copyright. That's why the "approval" Google has from the libraries is so silly, as are its claims that it is simply making a 'really big' card catalog.
The reader may benefit. Google and whoever profits from Google's linking may benefit, but no royalties flow to the author due to the linking, nor has the author chosen (present tense) to place the book online or in the marketplace. He may, in fact, consider the earlier work so dreadful that he intends to use copyright laws to their full extent to keep down his embarrassment. And despite the squawks of some posting here, we have no legal right to get easy access to what someone else has published. A copyright bestows the right to say, "No more copies will be published." That's why, for instance, an author can prevent anyone from making a movie derivative.
Arriba is a marvelous case for defending unauthorized linking and for indexing the web itself, as Google does. And as a Ninth Circuit appeals decision for someone living in the Ninth Circuit, it was "controlling" in my successful battle with Tolkien estate lawyers over whether my chronology of a fictional work was fair use--the law in that matter having been corrupted by some dreadful Second Circuit court actions in 1998.
But Arriba is weak precisely where Google is being challenged most strongly by authors and publishers--Google's right to scan and index the entire text of books that are in library collections but are, for the most part, out of print. For those books, Google cannot link to a website where the purchase of the book will result in income for the copyright holder. That's the key issue. Google Print may be a winning situation for Google and readers, but the copyright holder doesn't get a cent. Indeed, the very point of Google's action is to blast ahead, not bothering to even look for authors because that would be too much trouble. Authors, on the other hand, are expected to go to the enormous trouble of tracking down every instance of the use of their material by Google and a thousand Google-clones, and opt out of each individually.
And I might add that I say this as a "one Mac mini" author/editor/publisher who's placed virtually every
Google actually preserves these books by dennison_uy · 2005-11-01 10:02 · Score: 2, Insightful

I had this idea once that the only way I can make sure my important document withstands the test of time is to make multiple copies and distribute the copies across different locations (i.e. home, office, gf's house, etc). This is the same reason why I make multiple copies of an important document and post it in multiple locations over the internet.

Making a book available online is just about the same thing -- it only serves to preserve it, and what better way to do so other than making something like Google, with its many layers of protection against data corruption, its backup? It's the best thing to preserve your most treasured works other than spreading it via P2P.

How many times have I tried to open an old link, only to discover that it is already gone or most of the content has already changed? Well, good thing there's Google cache and the Wayback Machine as a form of "backup".

--
Take off every 'sig'!
All your 'sig' are belong to us!
Re:Out-of-print books by jrboatright · 2005-11-01 10:13 · Score: 2, Insightful

So, Joe Author writes a cool new book.

He sells publication rights to Acme Giant Publishing for oh say $3000.

Acme Giant Publishing prints 1000 books, sells some of them, and pulps the rest.

One year and a day later, the book falls out of copyright, and Acme prints 10,000 copies and sells them, and Joe Author gets nothing.

What is wrong with this picture?
Re:You missed the truck she drove thru Google stan by kebes · 2005-11-01 10:44 · Score: 2, Insightful

The owners of the copyrighted works cannot be forced into depending on the discretion of a third party to protect their works, regardless of Google's assurances, or whether the owner ever heard of the ability to opt-out or not.

Yes, owners of copyrighted works CAN be forced into depending on the discretion of a third party to protect their works. That's life. If copyright holders are really so scared of their works being copied, they can lock them all up in a vault and never sell them to anyone. Then they are really protected.

A library has a bunch of books. They protect those books. I can go to a library, steal the books, make copies, and sell the copies. This is illegal. I don't think any court would honestly hold the library responsible. In fact, I don't have to steal the books. I can borrow them, take them home, make copies and violate copyright and the library is STILL not responsible. In fact, I can use the photocopiers INSIDE the library to do my dirty work. That's life. I broke the law, the library did not.

Now I admit that the Google database is a little bit different. But as long as each copy they are making is fair use, then they are allowed to hold the database. Copyright holders can't say "but what if someone steals it and makes copies!!??" If that happens, you can sue the thief/copyright-violator, but not Google (barring any obvious negligence etc.).

Now, is it fair in the first place for Google to make those copies, and let people search (but not view) them? That's a separate issue that the courts are looking into. I personally find that it advances society without compromising the copyright-holder's monopoly. Therefore, I think it's legal. I also happen to think it's the "right" (ethical, etc.) thing to do, for society.

if Google loses control of the data that they do not own, they have very little legal basis to protect it.

Indeed, if Google loses control somehow, it will be up to the actual copyright holders to pursue legal action and so on. That's life. Libraries are not responsible (unless they willfully encourage people to break the law), and so neither should Google be responsible (unless they willfully encourage people to break the law).