Amazon Plan Would Allow Text Search Of Books
emmastory writes "The New York Times is running a story (free registration required) about a new development at Amazon - they plan to assemble "a searchable online archive with the texts of tens of thousands of books of nonfiction." Users would only be able to read a certain portion of the text from any one book, but it sounds promising nonetheless. The Times article suggests that this is part of a larger strategy to compete with Google and Yahoo by making Amazon an authoritative source of information on everything book-related."
Would this be like O'Reilly's Safari online books on steroids? Safari has been my favorite bookstore for a while now.
It would be very valuable to be able to open a chapter of the book and give a read over it, you know, like in a real fucking bookstore.
Amazon.com has their "Look inside this book" feature on a lot of titles, which lets you read a scanned excerpt of the book and see what you think. Just like in a real fucking bookstore!
It is well established that you can cite portions of a work (which seems to be what they're doing). If the portions are especially large, I would imagine that they'd have to get permission from the publishers.
Of course, as Amazon, they're probably in a position to do so.
Some wealthy do-gooder could pay Amazon to use this feature to the public's benefit, linking words such as "porn" to self-help books about sex-addiction and "bomb-making" to a similar book about dealing with pent-up anger...
Sure, your honour, I only OCR'd and put my entire book collection up on Kazaa so that people could search for passages before buying them from me. Same with my mp3s and DVDs, now that I think of it.
Let's look at the fair use provisions in the 1976 Copyright Act:
the fair use of a copyrighted work [...] for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright.
Purposes such as selling aren't covered, but let's read on, because as with most things written by lawyers for the benefit of lawyers, it's not that clear-cut.
In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include:
(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
(2) the nature of the copyrighted work;
(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
(4) the effect of the use upon the potential market for or value of the copyrighted work.
Well, you work it out. It's a copy of the entire work. That it's offered one piece at a time can't be a defence by itself, otherwise those fragments I upload and download to and from various people over eDonkey would be fine by the same argument. The duplication is clearly of a commercial nature (for Amazon's benefit), but on the other hand, it's arguably increasing the potential market for the copyrighted work.
That last one is a very, very interesting provision. If Amazon can argue that making entire copies and distributing parts of them - potentially all of them - for their profit is just increasing the market for the original work by way of advertising and promoting it, why can't I argue that for my eDonkey use?
If you think this argument is trite, have a look at www.sharereactor.com, which indexes content on eDonkey. You see the "Buy this at Amazon.com" links right there? What is eDonkey doing that's significantly different from Amazon? Is Amazon obtaining each and every rights owner's permission to perform this duplication? I doubt it, so the differences seem to be these:
It's easier to obtain all the fragments from eDonkey (but not much easier; it can take upwards of a week to completely download a large file). And sharereactor is not for profit, whereas Amazon is primarily interested in its own profit.
You work out where the morality and legality lies.
If you were blocking sigs, you wouldn't have to read this.
The NIH has a good start with something of this nature. The NCBI (part of the National Library of Medicine) has a fully searchable set of about 20 books. The books generally cover biology topics, but represent some of the standard texts used in college courses. They call the project Bookshelf and it is entirely free. Several books contain direct links to gene sequences, etc.
I'm surprised nobody's mentioned Project Gutenberg - I mean, they've been OCRing public domain books for a long time now, and there are thousands of texts available... not in some crappy interface that Amazon will use, but in wonderful, sweet, ASCII text format. Couple this with some good regular expressions and you're in business... want to see how many times Sherlock Holmes talked about using cocaine? It's elementary!
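For the regular-expression idea, here's a minimal Python sketch, assuming you already have the book as plain text. The function name count_mentions and the sample string are made up for illustration; the sample is placeholder text, not a quotation from any real book.

```python
import re


def count_mentions(text: str, word: str) -> int:
    """Count whole-word, case-insensitive occurrences of `word` in `text`."""
    # re.escape guards against regex metacharacters in the search term;
    # the \b anchors keep "cocaine" from matching inside a longer word.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


# Invented placeholder text (an assumption), standing in for a downloaded book.
sample = "Cocaine again? He reached for the cocaine-bottle on the shelf."

print(count_mentions(sample, "cocaine"))
```

To run it against a real text, read the file into a string first (for example with open(path, encoding="utf-8").read(), where path is whatever file you saved) and pass that in place of sample.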
Actually, most of the crappy writeups on Amazon are provided by the publisher, not Amazon at all. You're only looking at Amazon-originated content in the 'editorial reviews' section of a book page if it says 'Amazon.com' at the top. If it says 'From the Publisher', or 'Book Description', it's the publisher that provided it. This does, it must be said, stretch the definition of 'editorial reviews' somewhat.
Oh, and the books Amazon promotes on its front page, or on section header pages, under headings like 'what we're reading this month' - Amazon doesn't put them there off its own bat - it's done in co-operation with publishers, with publishers buying placements with virtual money called 'co-operative marketing funds', which are allocated on the basis of how much money the publishers' books made for the bookstore the previous year. Same deal with physical bookstores of course - spend co-op money, and you can get your books 'face out' on the shelf (cover showing, rather than spine), or onto an 'end-cap' (a display shelf at the end of a row), or even onto a table display.
A short time working in publishing is a great way to disabuse yourself of the notion that book stores know or care anything about the books they sell...
Amazon plans book-text search
You can only put around 25,000 books onto a DVD.
(or, actually, 12,000 books in two formats...)
Some guy proved this
I must retract my former statement: you are correct. According to BitLaw: