Slashdot Mirror


Amazon Plan Would Allow Text Search Of Books

emmastory writes "The New York Times is running a story (free registration required) about a new development at Amazon - they plan to assemble "a searchable online archive with the texts of tens of thousands of books of nonfiction." Users would only be able to read a certain portion of the text from any one book, but it sounds promising nonetheless. The Times article suggests that this is part of a larger strategy to compete with Google and Yahoo by making Amazon an authoritative source of information on everything book-related."

20 of 193 comments

  1. Re:Brilliant idea by steelerguy · · Score: 4, Insightful

    This looks like it is only for non-fiction. Usually not too hard to tell what a non-fiction book is about just by reading the title.

  2. speaking of searching with Amazon by Artifex · · Score: 4, Insightful

    Have you noticed that they now offer web searching as well, and are also generating third-party ads based upon what you're looking for?

    This development may bite them back - when I look for something on Amazon now, I often find in their ads that other people have the item cheaper. Amazon may get a nickel or quarter for the referral, but they lose the dollars from the markup.

    --
    Get off my launchpad!
  3. Re:Brilliant idea by tomstdenis · · Score: 5, Insightful

    True enough, but quality is in question too. Not all Calculus textbooks, for example, are of equal educational value.

    It would be very valuable to be able to open a chapter of the book and give a read over it, you know, like in a real fucking bookstore.

    The problem is that stores [brick and mortar] like Chapters.ca stock only self-help dime-a-dozen whim-of-the-minute books. In fact, when the local Chapters first opened you could walk in and buy TAOCP [I did :-)]. Now you would be lucky to get a calculus/algebra/science/anything textbook, and at best you can only find those "cheat sheet" books which basically tell you how to solve every problem [but not why the solution works].

    For the most part people have to blindly trust some review from "BigGuy4477" about the value of an $89 textbook...

    Tom

    --
    Someday, I'll have a real sig.
  4. Good Data by mindshadow · · Score: 2, Insightful

    See... I would pay up to about 50 dollars a month to have free access to reading those books online... I guess the problem would be printing them out and redistributing them. Perhaps maybe just manuals... I am so sick of shelling out 50 bucks so I can read 5 pages about some topic knowing I will never read the rest of the book. Love the web ... information is free ... hate the web ... information is not reliable and all over the place. :(

  5. legal? by hatrisc · · Score: 4, Insightful

    doesn't this infringe on basically every copyright that the publishing industry has?

    --
    I write code.
    1. Re:legal? by DeepRedux · · Score: 3, Insightful

      It looks like Amazon is going to get permission before they do this. First line of the article: "Executives at Amazon.com are negotiating with several of the largest book publishers...". There is no infringement if they have permission.

  6. Re:Patent this by keyslammer · · Score: 2, Insightful

    IANAL, but I think now that they've announced it, it can't be patented (unless it already has been).

  7. Re:Too bad ... by binaryDigit · · Score: 4, Insightful

    .. there was no mention of the actual search technology Amazon would be using to allow searching the text of such a large archive of books (why only non-fiction I wonder).

    This type of text searching has been around for a gazillion years and is not really that complex. It really depends on how flexible they want to make the searching. Case in point, wildcards. Google sacrifices flexibility by not allowing you to search on wildcards in their news searches in order to gain speed. Ditto for things like phrase searching, etc. The actual # of docs is pretty much irrelevant wrt search speed (at least directly). It depends more on the features you allow in your query language and the # of hits returned by each part of your query. Plus you're dealing with static data that can easily be distributed.
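    A rough sketch of that point, assuming a plain in-memory inverted index (this is not Amazon's or Google's actual design, and the names `build_index`, `search_all`, `search_phrase` and `search_prefix` are made up for illustration). A plain term lookup only touches the postings for the query terms. A phrase query has to compare word positions, and a wildcard (here a simple prefix) has to walk the whole vocabulary, which is why those features cost more than the raw number of documents does.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: [positions of the term in that doc]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search_all(index, query):
    """Docs containing every query term: intersect postings, smallest first."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = sorted((set(index.get(t, {})) for t in terms), key=len)
    result = postings[0]
    for p in postings[1:]:
        result = result & p
    return result


def search_phrase(index, phrase):
    """Docs where the terms appear consecutively; needs the position lists."""
    terms = tokenize(phrase)
    hits = set()
    for doc_id in search_all(index, phrase):
        starts = index[terms[0]][doc_id]
        if any(all(s + i in index[t][doc_id] for i, t in enumerate(terms))
               for s in starts):
            hits.add(doc_id)
    return hits


def search_prefix(index, prefix):
    """Naive wildcard ("spec*"): scans every term in the vocabulary."""
    prefix = prefix.lower()
    hits = set()
    for term, postings in index.items():
        if term.startswith(prefix):
            hits.update(postings)
    return hits


docs = {"a": "The origin of species", "b": "Species of the origin"}
index = build_index(docs)
print(search_all(index, "origin species"))
print(search_phrase(index, "origin of species"))
print(search_prefix(index, "spec"))
```

    Since the books are static, an index like this could be built once and split across machines by document, which fits the "easily distributed" remark.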

    The tough part of all this is getting the stuff in digital format. I assume for most current books it won't be a problem. The hassle would be older books that you'd actually have to OCR. Though once they're done, they would have a pretty valuable asset.

  8. Re:Patent this by ceejayoz · · Score: 3, Insightful

    IANAL, but I think now that they've announced it, it can't be patented (unless it already has been)

    That's funny. Oh... you're not trying to be funny.

    Have you missed the dozens of articles about people recently patenting things that've been around for 30+ years, then suing small businesses for cash?

    The USPTO seems to grant a surprising number of patents on things that "can't be patented".

  9. Re:Too bad ... by ceejayoz · · Score: 4, Insightful

    Looks like they'll be going with a proprietary solution... wouldn't partnering with Google make more sense for them?

    You are aware that Google's a proprietary solution, right?

    Just because Slashdot loves Google doesn't mean it's all of a sudden non-proprietary!

  10. ==free online books? by KingRamsis · · Score: 2, Insightful

    So if someone wrote a script that sequentially searches for the most popular words, could they end up with the whole text?
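    To make the question concrete, here is a toy simulation on a local word list, not anything run against Amazon. It assumes a hypothetical search service (`snippets`) that returns a few words either side of each hit together with where the snippet starts; a real service may expose neither, and the article says only a portion of any one book would be readable. The `reconstruct` helper and the `window` parameter are made up for illustration.

```python
def snippets(book_words, query, window=3):
    """Hypothetical service: (start, words) for `window` words around each hit."""
    out = []
    for i, w in enumerate(book_words):
        if w == query:
            lo = max(0, i - window)
            out.append((lo, book_words[lo:i + window + 1]))
    return out


def reconstruct(book_words, queries, window=3):
    """Merge every returned snippet into a {position: word} map."""
    recovered = {}
    for q in queries:
        for start, words in snippets(book_words, q, window):
            for offset, w in enumerate(words):
                recovered[start + offset] = w
    return recovered


book = "the quick brown fox jumps over the lazy dog near the river bank".split()

partial = reconstruct(book, ["the"], window=2)
print(len(partial), "of", len(book), "words recovered")

full = reconstruct(book, ["the", "fox"], window=2)
print(" ".join(full[i] for i in sorted(full)))
```

    In this toy case one common word recovers most of the text and a second query fills the gap, so the worry holds whenever snippets are generous and nothing caps how much of one book a user can see.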

  11. Research Humanity vs. P2P by tyrani · · Score: 4, Insightful

    This sounds like a good project that they could get some gov't funding for.

    Besides the obvious copyright problems, if the gov't were to get involved and Amazon (or whoever) were allowed to permit searching an entire book for concepts / keywords, but not viewing the entire book without paying for it, this would increase both sales and usefulness.

    If this had been the original model for online music, think of all the problems that would have been avoided. Perhaps a second look at this type of archiving will help the movie industry as bandwidth increases.

    --
    rejected (19) accepted (0)
    Is there a psychological term related to getting your stories rejected on slashdot?
  12. Re:O'Reilly on steroid? by Soko · · Score: 3, Insightful

    Not exactly, I think.

    Safari is access to the whole content of the book on-line, as well as searching for text within that content as well as any other books they have available on-line. IOW, Safari is actually a superset of the Amazon thing, since you can pay to read the whole book, not just search through it for snippets and passages.

    I love Safari as well - saves shelf space, trees and frustration (because of the search function). I wouldn't want to read a novel on-line, since a paper book is a better interface for that, but for reference material about programming/networking/Operating Systems etc., Safari works well, since you're in front of a machine anyway. And IIRC, errata in the books is applied directly to the text on-line, and you get the latest edition without having to get another book, just updated content.

    The only time having all of your reference material on-line would be a problem is if you need ref. material to get your Cisco router that connects you to the Internet back on-line.

    Soko

    --
    "Depression is merely anger without enthusiasm." - Anonymous
  13. Re:Brilliant idea by Anonymous Coward · · Score: 1, Insightful

    Most of the "look inside" pages I've seen have been stuff like the table of contents, index, or back page author biography. Not the text that people read.

  14. Re:OCR Be Damned! by buro9 · · Score: 3, Insightful

    They'd probably try and get a few publishers on board so that they can be supplied with digital versions of the text. I can't imagine that they would OCR everything... so they'd negotiate what they could from the outset.

    This would be very easy for publishers to accommodate, and they would do so more willingly if the book was old (e.g. Origin Of Species, etc).

  15. RealLife? by ryanoo · · Score: 5, Insightful
    The publishers said they have been guardedly cooperative.

    How authors will react is another question.

    Isn't this what happens in the RealWorld? You walk into a bookstore, open it up, read a few pages and make a decision on whether or not you want to buy it?

    I think publishers and authors would be rather short-sighted not to allow potential customers to shop online the same way they shop in brick and mortar stores.

  16. Re:Brilliant idea by whatch+durrin · · Score: 5, Insightful
    From my experience with non-fiction (college textbooks) in a "brick-and-mortar" store, the books are usually sealed shut with plastic wrap. That only goes for new books, of course.

    Besides, in college you usually don't have a choice about which textbook to use for the class. I guess you could always purchase supplemental books, but those are usually out of the price range/interest level/time scope of many college students.

    --
    ***
    Radio Shack. You've got questions...we've got blank stares(TM).
  17. Great idea. by rice_burners_suck · · Score: 3, Insightful
    This is an excellent idea. I would hope that I'd be able to read a few sentences or paragraphs from the text containing the search phrase, along with whatever pages I am able to preview before buying the book and I hope this will later be extended to fiction.

    Just imagine if Amazon did some deal with the Library of Congress that allowed them to scan in nearly every book published in the United States. Once the information is digitally stored, it could be utilized in other ways as well:

    • Libraries around the country could offer consoles on which you could read any book through a secure connection of some type, preventing the unauthorized copying that would otherwise keep book publishers from agreeing to this. You could essentially read any book, even if the library doesn't have it.
    • Bookstores, schools and other organizations might get in on this network and offer the same service.
    This service doesn't even have to be free. I'd pay a subscription fee to have access to this information, as would the bookstores and whatnot.
  18. Re:this could be huge... by CycleMan · · Score: 3, Insightful
    Remembering my student days, I'm glad I didn't have such a search function. A search function lets you bypass what you're not specifically looking for. In an academic quest for knowledge, sometimes you need all the paragraphs of disclaimers and limiters around the cute phrase you're looking for, or you'll radically misinterpret the phrase.

    One example from current events: Bush said in his State of the Union address, "The British government has learned that Saddam Hussein recently sought significant quantities of uranium from Africa"

    However, several news organizations excluded the first six words of that sentence, and then called the President a liar. The President's intelligence or honesty aside, intentionally excluding these words dramatically distorts the meaning of the phrase, to the detriment of those using the filter.

  19. Re:Brilliant idea by FroMan · · Score: 2, Insightful

    Why don't you just go to the "real f---ing bookstore?"

    If you don't like how an online business does things, don't use the online business.

    If you don't realize the difference between a brick and mortar store providing physical access to the product and an online store providing a digital copy of the product, you need to get your head examined.

    Basically they would be giving the book away. My guess is that the publisher has a problem with that.

    Original point, if you don't like the rules, don't play the game.

    --
    Norris/Palin 2012
    Fact: We deserve leaders who can kick your ass and field dress your carcass.