Slashdot Mirror


On the Google Book Scanning Project and the Library We Will Never See (theatlantic.com)

For a decade, Google's enormous project to create a massive digital library of books was embroiled in litigation with a group of writers who said it was costing them a lot of money in lost revenue. Even as Google notched a victory when a federal appeals court ruled that the company's project was fair use, the company quietly shut down the project. From an article published in April this year: Despite eventually winning Authors Guild v. Google, and having the courts declare that displaying snippets of copyrighted books was fair use, the company all but shut down its scanning operation. It was strange to me, the idea that somewhere at Google there is a database containing 25-million books and nobody is allowed to read them. It's like that scene at the end of the first Indiana Jones movie where they put the Ark of the Covenant back on a shelf somewhere, lost in the chaos of a vast warehouse. It's there. The books are there. People have been trying to build a library like this for ages -- to do so, they've said, would be to erect one of the great humanitarian artifacts of all time -- and here we've done the work to make it real and we were about to give it to the world and now, instead, it's 50 or 60 petabytes on disk, and the only people who can see it are half a dozen engineers on the project who happen to have access because they're the ones responsible for locking it up. But Google appears to be thinking of ways to make use of it. Last month, it added a new feature to its search function that instantly connects you with eBook data from libraries near you. From a report: Now, every time you search for a book through Google, information about your local library rental options will be easily available. Yeah, that's right. Your local library not only still exists, but it has eBooks, which are things you can totally borrow (for free) online!
Before, this perk was hidden somewhere deep within your local library's website -- assuming it had one -- but now these free literary wonders are all yours for the taking.

27 of 165 comments

  1. for free by supernova87a · · Score: 3, Insightful

    Well, actually, isn't the problem that they want to sell it / use it for commercial purposes? If Google simply wanted to put this on the web for absolutely free, with no links to anything else, couldn't they?

    I thought it's only when you're trying to sell something that these issues arise.

    1. Re:for free by Chris+Mattern · · Score: 5, Informative

      I thought it's only when you're trying to sell something that these issues arise.

      You thought wrong. It's a widely held fallacy about copyright, though. Copyright covers any unauthorized reproduction of a work, whether it's for sale or not. The only exceptions are for parody or fair use (which means such things as small quotes in a review of the work).

    2. Re:for free by Geoffrey.landis · · Score: 4, Insightful

      As an author, yes, I would like to be paid when my works are distributed.

      The problem is that Google wanted to distribute the work from authors for free.

      I do know that the idea that people should be paid for their work is controversial on /., where many commentators believe that information-- meaning other people's work-- should be free, and authors should be happy to starve, because, hey, it's exposure.

      Well, actually, isn't the problem that they want to sell it / use it for commercial purposes? If Google simply wanted to put this on the web for absolutely free, with no links to anything else, couldn't they?

      Google is the most valuable company in the world. They may want to distribute other people's work for free, but they themselves plan to make a huge profit from doing so.

      It's merely the authors who don't get paid.

      --
      http://www.geoffreylandis.com
    3. Re:for free by Anonymous Coward · · Score: 3, Insightful

      Copyright length is the main issue, not a differing business model. There's a lot of content out there whose authors are dead, and income is the least of their worries.

    4. Re: for free by Robotech_Master · · Score: 2

      No, Google never wanted to distribute those works for free. (Except the public domain ones.) That was the Authors Guild's idea.

      See my comment further down the thread, and the link therein.

      --
      Editor Emeritus and Senior Writer, TeleRead.org
    5. Re:for free by 140Mandak262Jamuna · · Score: 4, Insightful

      You must be a goblin.

      To a goblin, the rightful and true master of any object is the maker, not the purchaser. All goblin-made objects are, in goblin eyes, rightfully theirs.

      "But if it was bought —

      then they would consider it rented by the one who had paid the money. They have, however, great difficulty with the idea of goblin-made objects passing from wizard to wizard. You saw Griphook's face when the tiara passed under his eyes. He disapproves. I believe he thinks, as do the fiercest of his kind, that it ought to have been returned to the goblins once the original purchaser died. They consider our habit of keeping goblin-made objects, passing them from wizard to wizard without further payment, little more than theft.

      --
      sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    6. Re: for free by easyTree · · Score: 2

      Uhh..., "All known knowledge" -> "All knowledge"

    7. Re: for free by easyTree · · Score: 2

      How much do (you believe) I owe you for reading your comment?

    8. Re:for free by rgmoore · · Score: 2

      Parody is actually a form of fair use; it's legally considered a form of criticism, which is one of the things fair use is intended to protect. Fair use is actually a very complicated legal issue that has to be decided on the particulars of each case. It depends on a balance of four different factors (by statute):

      1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

      2) the nature of the copyrighted work;

      3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

      4) the effect of the use upon the potential market for or value of the copyrighted work.

      So commercial use definitely weighs against a ruling of fair use, but isn't an absolute factor. Courts have worked with these principles and like to talk about "transformative" nature of the work that's doing the copying. If the new work is something radically different from the original, it's far more likely to be considered fair use.

      My impression is that Google won its case by arguing that they were only showing selected snippets of the works they had scanned (helps on point 3) and that it should actually help the potential market for the copyrighted works they were copying (helps on point 4). The idea is that by showing only short snippets, they were making people familiar with the original work while not providing them with enough of it to be really useful. That should make people more, not less, likely to buy the original.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

    9. Re:for free by careysub · · Score: 4, Insightful

      The key piece of this picture that no one (yet, in any of the comments posted thus far) has even mentioned is that we are talking about books that are out of print. These are books that you cannot buy (unless you can find an old copy, which may be exorbitantly expensive), and that make the author no money at all. Zero.

      This is about 25 million books. Further, it is estimated that half of these books are out of copyright under every iteration and perversion of copyright law and thus are already in the public domain - they belong to the public as is and was the intent of copyright law from the beginning.

      And the Google-Authors Guild deal actually provided a way to provide some revenue to authors of out-of-print books. Nearly all books go out of print after several years, never to be printed again, so nearly all authors face this issue.

      So this is a lose-lose-lose situation (for Google, the public, and authors of out-of-print books).

      That so many books can be in the public domain and yet be unavailable is largely the result of the constant expansion of copyright at the behest and for the benefit of corporations that own publishing rights that has plagued society throughout the Twentieth Century.

      --
      Starships were meant to fly, Hands up and touch the sky - Nicky Minaj
    10. Re: for free by Robotech_Master · · Score: 2

      No, you're wrong. Even all the way back in 2004, Google was talking about making books easy to search, not making them available for free. Making them available was the Authors Guild's idea.

      --
      Editor Emeritus and Senior Writer, TeleRead.org
  2. This is an old article; has anything new happened? by mellon · · Score: 3, Interesting

    I saw this go by back in April and was made sad by it. Now I am being made sad by it again. I wonder how hard it would be to crowdsource the same work. Like, just have everybody who thinks this is a tragedy do 10 books, and see how many that adds up to. The Google OCR API is available for use, and I think they may even have open sourced it so you don't have to run it in the cloud.

  3. AI silly! by eager_agony · · Score: 3, Interesting

    They have a great corpus to train their AI with now. Maybe the best in the world.

  4. Face it by thegreatbob · · Score: 4, Interesting

    I'm sure others will note... Google almost certainly just wanted the data. Why would they need/want anything else out of the arrangement?

    --
    There is no XUL, only WebExtensions...
  5. They got 1 terabyte in and... by Lodragandraoidh · · Score: 2

    I think what happened is they got 1 terabyte in and realized that the data started to repeat over and over...and over.

    --

    Lodragan Draoidh
    The more you explain it, the more I don't understand it. - Mark Twain
  6. Why not campaign for better Copyright laws by clickety6 · · Score: 4, Insightful

    Hey Google, use some of that vast money stockpile to undo the damage that companies have been doing to Copyright laws. Get some reductions in copyright duration to something more reasonable (15 years!) and then you'll be able to release the vast majority of your scanned books.

    --
    ----------------------------------- My Other Sig Is Hilarious -----------------------------------
  7. Re:This is an old article; has anything new happen by grumbel · · Score: 3, Informative

    I wonder how hard it would be to crowdsource the same work.

    Project Gutenberg has been at it since the '70s. But they currently only have 54,000 books, not a whole lot compared to Google's 25 million books.

  8. The fallacy of the "new Alexandria" by Robotech_Master · · Score: 5, Informative

    Getting to see the books is not what Google Books is for. It was never what Google Books was for. You've bought into the fallacy promoted by the Authors Guild, who came in after the fact and tried to wangle their lawsuit against Google Books into an orphaned-works library without actually having any authority to do so. Google shrugged and went along with it, because why not, but it was never what they had intended.

    From the very beginning, Google Books (nee Google Print) was intended to populate a search database so people could search within paper books as easily as they could search within the web. If the book was still in copyright, then finding that book to read was the searcher's problem. (Interlibrary loan works a treat.) Google was very straightforward about that in early blog posts and publicity about the project. Don't blame them for falling short of the Authors Guild's goals. Those goals were never theirs to begin with. See the link in the first paragraph for more information.

    --
    Editor Emeritus and Senior Writer, TeleRead.org
  9. What is stopping Google from operating as a librar by kiviQr · · Score: 3, Interesting

    What is stopping Google from operating as a library? For each city, have a pool of ebooks that users can borrow for a week. They could have books that you can borrow for 1 min for search purposes. It should be cheaper than publicly funded libraries.

  10. This has been wonderful for me by idji · · Score: 4, Interesting

    Google Books helped me find books from 1838 that mentioned ancestors of mine by name and what they were doing. This is priceless to me.

  11. Re:Copyright Insanity by bws111 · · Score: 2

    RIAA established 1952
    MPAA established 1922
    Disney Corp founded 1923
    Berne Copyright extension of copyright to author's death + 50 years - 1908

  12. Re:Dead [Re:for free] by BronsCon · · Score: 4, Insightful

    I actually read that as "dead authors don't need to get paid, copyright shouldn't outlive the author". I suppose I could stretch it to imply that copyright should be more limited than that, as well; say, the 14 years it was originally. And remember, when copyright was 14 years, printing and distribution were much slower than what we're capable of today. A book that would have taken a year to go to press and be shipped across the globe can now arrive on everyone's shelf tomorrow; if anything, that should further shorten copyright terms.

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
  13. For thousands of years, life sucked by Geoffrey.landis · · Score: 2

    For thousands of years, authors, artists and musicians didn't expect to get paid for their work, and they did it anyway.

    And for thousands of years peasants starved to death in years when the local harvest was poor, and died of disease when a plague passed through. And, more to the point, had their stuff taken away by anybody who passed by who was equipped with swords, spears, arrows, and armor.

    Your point is that ancient societies were somehow better than ours? That societies for thousands of years condoned slavery, so we should, too?

    --
    http://www.geoffreylandis.com
  14. Re:It will be remembered in history. by pinkocommie · · Score: 4, Informative

    This is apocryphal. While that sort of sentiment existed (or still exists?) within Islam, the claim that Omar ordered its burning first appeared many centuries later. Also, the actual burning of the library was centuries prior to the advent of Islam.

  15. Christians burnt the Library of Alexandria by ghoul · · Score: 2

    Muslims preserved the knowledge of Greek and Roman cultures while Christians were busy burning it. In fact by the time of the Muslims conquering Egypt the Christians had held sway for centuries in Egypt and the library of Alexandria was long burnt.

    --
    **Life is too short to be serious**
  16. So, uhh, Archive.org anyone? by Myself · · Score: 3, Informative

    Meanwhile, archive.org is scanning a thousand new books every day and nobody's writing news stories about it...

  17. Re:Dead [Re:for free] by BronsCon · · Score: 2

    How did authors make money before copyright? I mean, written works predate copyright, so someone must have paid for them, right? The original 14 years was a gift to authors, as it allowed them to earn a bit more than the initial writing would afford them, while balancing against the greater good of an enriched society via the public domain.

    If an author hasn't made anything in the handful of years before their work goes out of print (and that's a smaller handful if it's not selling), they're not going to make anything on that work before they die and their family isn't going to make anything on it in the 70 years that follow. Because it's out of print. Because it wasn't selling.

    If you haven't made a profit in 14 years, you're not going to. If you haven't made something else profitable in 14 years, I should say you've not contributed enough to society to deserve to continue profiting.

    Copyright is what gets me paid, by the way. If it took me 14 years to profit off of my work, I'd fucking starve.

    --
    APK quotes people (including myself) without context and should not be trusted. Just thought you should know.