Slashdot Mirror


Amazon Plan Would Allow Text Search Of Books

emmastory writes "The New York Times is running a story (free registration required) about a new development at Amazon - they plan to assemble "a searchable online archive with the texts of tens of thousands of books of nonfiction." Users would only be able to read a certain portion of the text from any one book, but it sounds promising nonetheless. The Times article suggests that this is part of a larger strategy to compete with Google and Yahoo by making Amazon an authoritative source of information on everything book-related."

14 of 193 comments

  1. O'Reilly on steroid? by UnderAttack · · Score: 5, Informative

    Would this be like O'Reilly's Safari online books on steroids? Safari has been my favorite bookstore for a while now.

    --
    ---- join dshield.org Distributed Intrusion Detec
    1. Re:O'Reilly on steroid? by javatips · · Score: 2, Informative

      Safari is not a book store. It's a rental library where you can only get a section of a book at a time (unless you are permanently connected to the Internet).

      It would be a book store if you could buy and download a complete book so you can read it however you please (online or offline, on-screen or off-screen).

  2. Re:Brilliant idea by ceejayoz · · Score: 2, Informative

    > It would be very valuable to be able to open a chapter of the book and give a read over it, you know, like in a real fucking bookstore.

    Amazon.com has their "Look inside this book" feature on a lot of titles, which lets you read a scanned excerpt of the book and see what you think. Just like in a real fucking bookstore!

  3. Re:legal? by keyslammer · · Score: 4, Informative

    It is well established that you can cite portions of a work (which seems to be what they're doing). If the portions are especially large, I would imagine that they'd have to get permission from the publishers.

    Of course, as Amazon, they're probably in a position to do so.

  4. Change the world... by mgcsinc · · Score: 2, Informative

    Some wealthy do-gooder could pay Amazon to use this feature to the public's benefit, linking words such as "porn" to self-help books about sex addiction and "bomb-making" to a similar book about dealing with pent-up anger...

  5. One rule for them... by Rogerborg · · Score: 2, Informative

    Sure, your honour, I only OCR'd and put my entire book collection up on Kazaa so that people could search for passages before buying them from me. Same with my mp3s and DVDs, now that I think of it.

    Let's look at the fair use provisions in the 1976 Copyright Act:

    the fair use of a copyrighted work [...] for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright.

    Purposes such as selling aren't covered, but let's read on, because as with most things written by lawyers for the benefit of lawyers, it's not that clear cut.

    In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include:

    (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

    (2) the nature of the copyrighted work;

    (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

    (4) the effect of the use upon the potential market for or value of the copyrighted work.

    Well, you work it out. It's a copy of the entire work. That it's offered one piece at a time can't be a defence by itself, otherwise those fragments I upload and download to and from various people over eDonkey would be fine by the same argument. The duplication is clearly of a commercial nature (for Amazon's benefit), but on the other hand, it's arguably increasing the potential market for the copyrighted work.

    That last one is a very, very interesting provision. If Amazon can argue that making entire copies and distributing parts of them - potentially all of them - for their profit is just increasing the market for the original work by way of advertising and promoting it, why can't I argue that for my eDonkey use?

    If you think this argument is trite, have a look at www.sharereactor.com, which indexes content on eDonkey. You see the "Buy this at Amazon.com" links right there? What is eDonkey doing that's significantly different from Amazon? Are Amazon obtaining each and every rights owner's permission to perform this duplication? I doubt it, so the differences seem to be these:

    It's easier to obtain all the fragments from eDonkey (but not much easier, it can take upwards of a week to completely download a large file). And sharereactor is not for profit, whereas Amazon is primarily interested in their own profit.

    You work out where the morality and legality lie.

    --
    If you were blocking sigs, you wouldn't have to read this.
    1. Re:One rule for them... by aziraphale · · Score: 4, Informative

      > Are Amazon obtaining each and every rights owners' permission to perform this duplication? I doubt it

      Why do you doubt it? You do realise that Amazon has a direct business relationship with every publisher whose books it sells already, don't you? They don't buy their books from Barnes & Noble...

      Amazon's book buyers will offer this facility to publishers (whose salespeople they already work with directly - many publishers will employ one person whose entire job is selling books to Amazon) as a marketing benefit - and charge them for the privilege, no doubt - just as they do today with their 'look inside' feature. In order to keep competitive, publishers will prepare and supply the text in the format Amazon wants. It's really not hard for Amazon to do this at all.

    2. Re:One rule for them... by Rogerborg · · Score: 2, Informative

      What they want isn't necessarily what they get. However, the Amazon T&C's require anyone sending them content for sale to warrant that they have "full authority" to grant a "royalty-free, nonexclusive, worldwide, perpetual, irrevocable right and license to use, reproduce, perform, display and distribute, and adapt, modify, reformat, create derivative works of any content" and further that they can and do grant Amazon the rights to sublicense these rights.

      Any author signing away these rights to a publisher deserves to be royally screwed over. I'd rather self-publish and make nothing than gamble on receiving pennies under these terms. This flies in the face of the intent of copyright, and illustrates perfectly how completely publishers can now demand outrageous boilerplate licensing terms in return for making money by selling the fruits of other people's labours.

      --
      If you were blocking sigs, you wouldn't have to read this.
  6. Your tax dollars at work by Anonymous Coward · · Score: 5, Informative

    The NIH has a good start with something of this nature. The NCBI (part of the National Library of Medicine) has a fully-searchable set of about 20 books. The books generally cover biology topics, but represent some of the standard texts used in college courses. They call the project Bookshelf and it is entirely free. Several books contain direct links to gene sequences, etc.

  7. What about searching through the old stuff? by machinecraig · · Score: 5, Informative

    I'm surprised nobody's mentioned Project Gutenberg - I mean, they've been OCRing public domain books for a long time now, and there are thousands of texts available... not in some crappy interface that Amazon will use, but in wonderful, sweet, ASCII text format. Couple this with some good regular expressions and you're in business... want to see how many times Sherlock Holmes talked about using cocaine? It's elementary!
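
    As a rough sketch of the plain-text-plus-regex idea, something like the following would count whole-word matches and list the lines they occur on. The function names and the three sample lines are made up for illustration (they are not quotations from any book); with a real text you would read a downloaded plain-text file into a string first.

```python
import re


def count_mentions(text: str, word: str) -> int:
    """Count case-insensitive whole-word occurrences of `word` in `text`."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return len(pattern.findall(text))


def matching_lines(text: str, word: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for every line containing `word`."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Invented stand-in text; replace with the contents of a real plain-text file,
# e.g. open("some_book.txt", encoding="utf-8").read() (hypothetical file name).
sample = (
    "Holmes reached for the cocaine bottle.\n"
    "Watson objected to the Cocaine habit.\n"
    "The violin was silent.\n"
)

print(count_mentions(sample, "cocaine"))
for number, line in matching_lines(sample, "cocaine"):
    print(number, line)
```

    The `\b` word boundaries keep the search from matching inside longer words, and `re.escape` means the search term is treated literally rather than as a pattern.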

  8. Re:Brilliant idea by aziraphale · · Score: 5, Informative

    Actually, most of the crappy writeups on Amazon are provided by the publisher, not Amazon at all. You're only looking at Amazon-originated content in the 'editorial reviews' section of a book page if it says 'Amazon.com' at the top. If it says 'From the Publisher', or 'Book Description', it's the publisher that provided it. This does, it must be said, stretch the definition of 'editorial reviews' somewhat.

    Oh, and the books Amazon promotes on its front page, or on section header pages, under headings like 'what we're reading this month' - Amazon doesn't put them there off its own bat - it's done in co-operation with publishers, with publishers buying placements with virtual money called 'co-operative marketing funds', which are allocated on the basis of how much money the publishers' books made for the bookstore the previous year. Same deal with physical bookstores of course - spend co-op money, and you can get your books 'face out' on the shelf (cover showing, rather than spine), or onto an 'end-cap' (a display shelf at the end of a row), or even onto a table display.

    A short time working in publishing is a great way to disabuse yourself of the notion that book stores know or care anything about the books they sell...

  9. cnn link by $exyNerdie · · Score: 2, Informative
  10. Re:How long before? by Anonymous Coward · · Score: 2, Informative

    You can only put around 25,000 books onto a DVD.
    (or, actually, 12,000 books in two formats...)

    Some guy proved this
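
    For a sense of scale, here is a back-of-the-envelope check of what those figures imply per book. It assumes a single-layer DVD holding 4.7 GB (4.7 billion bytes); the comment does not say which disc type or file formats were used, so treat the result as an illustration only.

```python
# Assumption: single-layer DVD, 4.7 GB in decimal units.
DVD_BYTES = 4_700_000_000


def bytes_per_book(capacity_bytes: int, file_count: int) -> int:
    """Average space available per file, rounded down to whole bytes."""
    return capacity_bytes // file_count


# 25,000 books in one format each.
print(bytes_per_book(DVD_BYTES, 25_000))

# 12,000 books stored in two formats each, i.e. 24,000 files.
print(bytes_per_book(DVD_BYTES, 12_000 * 2))
```

    Under that assumption each file gets a bit under 200 KB on average, which is only plausible for plain or compressed text rather than scanned page images.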

  11. Re:Patent this by keyslammer · · Score: 3, Informative

    > I think that you have 1 year from public announcement to patent an idea.

    I must retract my former statement: you are correct. According to BitLaw:

    The most important rule, however, is that an invention will not normally be patentable if:
    • the invention was known to the public before it was "invented" by the individual seeking patent protection;
    • the invention was described in a publication more than one year prior to the filing date; or
    • the invention was used publicly, or offered for sale to the public more than one year prior to the filing date.