Google To Resume Scanning Books
SenseOfHumor writes "The Wall Street Journal is reporting that Google will resume scanning copyrighted books from Stanford and Univ of Michigan libraries. Let the battle resume!" From the article: "It isn't known just what percentage of library holdings fall into the category of being in copyright but out of print. About 18% of the books held by the libraries working with Google were printed prior to 1923 and are therefore in the public domain, according to an analysis by the Online Computer Library Center, a Dublin, Ohio, nonprofit library cooperative. An unknown percentage of the rest still are protected by copyright, depending on whether it was renewed. Google's resumption of its scanning of copyrighted works comes amid heated debate in the library community over participation in the program."
Is there some machine they have that separates all the pages and scans each one?
How do they verify that the items being scanned are being scanned properly?
We have secretly replaced these Slashdot mods' sense of humor with a rusty nail. Let's see if they notice!!
I'm not usually one to support the knee-jerk slashweenie response, that all intellectual property is theft, but I do object to commercial enterprises refusing to sell me something and insisting that I can't copy it.
It might be their intellectual property but it's my culture, dammit. If they won't keep it in print and sell me a copy, which I'm willing to pay for, then they should keep their mouths shut when I go and find one for myself.
Anybody got a DVD of Dance of the Vampires they can let me copy then?
until this is available! I can't count the number of times that I have been flipping through a book and wished I could use the search function, only to realize that it doesn't work in meatspace. Searching is up to me, using old-fashioned tools, like the table of contents or the index!
Curb CO2 emissions: Kill yourself today!
Would Google be allowed to store scanned copies of books even if the authors opt out? Someday, those print copies are going to be destroyed or deteriorate to the point of uselessness, which means that Google could be archiving works that might otherwise be lost forever.
I still don't get the uproar over the scanning, because it's not like the entire book is made available for free. The search is so crippled that it makes me think the people who are upset have never used it before.
Goo goo g'joob.
A lot of folks are going mental about the "copyright implications" of Google Books, and I'm just laughing. On my bookshelf is a first-edition collection of George Bernard Shaw plays, printed in the UK in 1911. There's a legend on the inside cover that is a reference to the U.S.'s lack of copyright laws at the time: (paraphrasing from memory:)
FTA: "I feel that this is a potential disaster on several levels," said Michael Gorman, president of the American Library Association and university librarian at California State University, Fresno. "They are reducing scholarly texts to paragraphs. The point of a scholarly text is they are written to be read sequentially from beginning to end, making an argument and engaging you in dialogue."
The sad thing is, scholarly texts are so abundant nowadays that it's nigh impossible to keep oneself current with all the new things published. Already there are magazines that only (or mostly) contain abstracts or reviews of new dissertations and articles. I fail to see how Google Print is a greater disaster than this. If anything, it'll only improve the situation.
Man is a slave because freedom is difficult, whereas slavery is easy.
Seriously, though, I feel like I'm missing something here. What is it?
I cried real tears when Li Mu Bai died.
As a recognition of this debt to society, intellectual property that is not in the public domain should be taxed. Just as our other physical property is taxed, why not intellectual property?
And the taxes can be used to invest in new science, technology, and the arts.
This has the added benefit of also moving a bunch of stuff into the public domain.
If the taxes aren't paid within two years, then the item moves into the public domain. If you aren't sure on the status of an item, see if it has had IP taxes paid in the last two years. If not, then it's free!
Abstinence is a government conspiracy. www.SafeSexZone.co
1. Connect to Google.
2. Download a rare book only found in a handful of libraries.
3. Go read it....
It's great that a lot of the people opting out think this will save them, when all you need to do is visit your local library and get the entire book to read for free. Surely this is worse?
My colleague Jamie wrote the following letter to Wired yesterday regarding Lawrence Lessig's column supporting Google Print.
I think she makes some compelling points about the problems with Google's plan...
-------------
Lessig's Tough Call
In defending Google Print ("Google's Tough Call," issue 13.11), Lawrence Lessig and others overlook one thing. If the publishers and authors have no rights to prevent this, what rights does Google have to protect its own extensive efforts in creating this database? By their own arguments, the answer must be: none. Google does not own the raw data. In almost talking point fashion, Google, Lessig and others describe this as nothing more than a "card catalog." This description could come back to haunt Google, as the only thing they own is their original presentation of the data itself. And the image of a card catalog does not bring to mind "originality."
If the Google DRM is broken and I create my own "Jamie Print" index on the web... without Google's ads... what basis would Google have to argue? Google can scan a million books and by Lessig's arguments, that investment is irrelevant. If I find a way to download those million books from Google, store the data and use my own search engine, Google's supposed benevolence in creating this project will be hard to swallow amidst a flurry of lawsuits against my superior ad-free index. Google would have little basis to sue except under the DMCA, a statute whose very existence is vilified by Lessig and the very people defending Google Print as progress (and I don't care for it either).
If Google's investment in the project cannot be protected, they may have little incentive to create this and other projects. Isn't this much the same for the publishers and authors seeking protection for the right to control their work? Lessig defends Google Print in the name of progress, but progress is a careful balance of reward and public benefit. Google might not create Google Print if it cannot profit from the ads it inserts and publishers may lose out if they cannot choose how to profit from their properties.
It is almost inevitable that Google Print will be subverted and Google will seek the very same protections that it claims the publishers should not have.
Jamie Cole
New York, NY
I currently attend the University of Michigan, and have encountered much frustration in tracking down books. The university has several libraries, and finding some books is nearly impossible. Additionally, the University has old collections and manuscripts that are barely indexed in the University's system. For the purposes of research, scanning the books is a dream come true: searching for keywords, quickly finding books, and viewing old manuscripts that one would normally need to be present at the library (and under supervision) to view. The copyright issue is important, but the books that are in the public domain (primary sources especially) should definitely be scanned. As for the copyrighted books themselves, Google does not allow the full book to be viewed. If anything, Google advertises for these books. As a student, I would not buy the book anyway, so what is the harm?
What I find funniest about the entire copyright debate is how so few people are actually aware of what a flimsy basis copyright rests on. Intellectual property rights are neither property nor rights. They're grants (a decidedly un-libertarian form of state monopoly), given by the government, with the explicit intent of promoting the public good. Copyright holders are created for the good of society, not the other way around. The way I see it, once copyright starts being used to limit the creation and propagation of information and culture rather than encourage it, copyright might as well just not exist.
Irritable, left-wing and possibly humorous bumper stickers and t-shirts
The fact that out-of-print books have copyright protection is further proof that Congress is more interested in hewing to the corporate line than adhering to Constitutional principles. How does preventing any further publication of a work for nearly 100 years promote the useful arts and sciences?
I would make copyright dependent upon making the copyrighted material available for the duration of the copyright. If it falls out of publication for a year and a day, then the copyright lapses. Making the material available online would be a cheap and easy way to maintain your copyright. Those that don't like that notion are free to publish and warehouse physical copies. In order to close an obvious loophole, I would further require that the copyrighted material be available at no more than the original cost, adjusted each year for inflation.
This has probably been mentioned in previous articles but oh well.
Many librarians I have met (not all, or even most, but some) have this weird mentality of "I am the gatekeeper of knowledge, you must have my leave to access the wisdom of the ages." They basically believe that knowledge is so sacred (it is) that only they are fit to guard it and distribute it (very not true).
When I was younger (elementary school, early-to-mid 90s) and you needed to research something, you had to go to the library (either your school's or the public one), use a computer to look up a book (if you knew what it was) or (more often) ask a librarian to help you find books that would be useful for your topic. This gave the librarians great power because it allowed them to determine all the information you were going to be using. When you learn and retain something, it becomes a part of you; by deciding what you learn, they are in essence changing you.
Now (for me, ever since middle school), you want to know more about ancient Egyptian art? Google it and find 100s of pages of information (well, realistically you will likely only use about 10 of those pages, but you get the idea). Want to know more about the 2000 US election? Google it. Before, if you wanted to find information about certain topics (primarily recent or highly specific ones), you were out of luck because often the libraries didn't have it. However, with things such as Google and Wikipedia, you now have access to almost any information you want from anywhere you have a computer with an internet connection.
(Beware, point soon approaching. Be prepared to duck)
Taking all this into account, it is not surprising that many librarians are reacting so harshly to this. They are all for making information more accessible, but not if it doesn't go through them. It's like a company with a monopoly that it has had for ages: They've become used to the power and don't want to give it up.
The world has been slowly changing. It has become more and more difficult to control information. And as the cliche goes: Knowledge is power.
Speaking is NOT communication
"Scanning" of old books is typically done with a camera photographing a book lying in a cradle (to not split the binding). One image is taken of each page or every two pages (the latter is faster, but has focus problems).
Once photographed, OCR software grinds away. There are errors. Some projects proof-read the errors (this is very expensive), but with Google's volume they cannot. Even when not proof-read, however, the OCR'ed text has high value in search engines.
For examples of the resulting product, see U of Michigan's Making of America or the Library of Congress American Memory.
New, in-print books can be scanned destructively. That is, saw off the binding and feed the pages into a sheet-fed scanner. This works with publishers who have extra copies they can expend.
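To illustrate the earlier point that OCR'ed text is still valuable for search even when nobody proof-reads it, here is a minimal sketch in Python. Everything in it is made up for illustration (the sample pages, the function names `build_index` and `search`, the 0.8 similarity cutoff), and it is certainly not how Google or any library project actually does it. It builds a word-to-page index and uses approximate string matching from the standard library, so a query can still land on a page where the OCR misread a word.

```python
import difflib
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page numbers it appears on."""
    index = defaultdict(set)
    for page_no, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_no)
    return index


def search(index, query, cutoff=0.8):
    """Return sorted page numbers containing the query word or a near match.

    The cutoff is an arbitrary illustrative value, not a tuned one.
    """
    # get_close_matches also returns exact matches (similarity 1.0).
    candidates = difflib.get_close_matches(
        query.lower(), list(index), n=5, cutoff=cutoff
    )
    hits = set()
    for word in candidates:
        hits |= index[word]
    return sorted(hits)


# Hypothetical OCR output; page 2 has "manuscripts" misread as "manuscnpts".
pages = {
    1: "The library holds many rare manuscripts.",
    2: "Rare manuscnpts are kept under supervision.",
    3: "The index lists every chapter.",
}
index = build_index(pages)
print(search(index, "manuscripts"))  # -> [1, 2]
```

The exact-match-only version of this would miss page 2 entirely. Approximate matching trades a few false hits for recall, which seems like the right trade when proof-reading millions of pages is not an option.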
There is really no precedent for what Google is doing, so it has become a test case for the limits of fair use. We may all agree that it seems obvious that it is fair use (in fact, many lawyers have suggested just that), but until a court of law deems it fair use, Google will be challenged. It will probably go to the Supreme Court within a couple of years, and we can only hope that the conservative justices being appointed by Bush will allow it under fair use. Fortunately, Google has fairly deep pockets, so it may be able to win the case.