Slashdot Mirror


Amazon Plan Would Allow Text Search Of Books

emmastory writes "The New York Times is running a story (free registration required) about a new development at Amazon - they plan to assemble "a searchable online archive with the texts of tens of thousands of books of nonfiction." Users would only be able to read a certain portion of the text from any one book, but it sounds promising nonetheless. The Times article suggests that this is part of a larger strategy to compete with Google and Yahoo by making Amazon an authoritative source of information on everything book-related."

35 of 193 comments

  1. Brilliant idea by seinman · · Score: 4, Interesting

    If this happens, maybe we'll finally be able to find books based on their actual content instead of the (usually pretty crappy) write-ups that Amazon does on them.

    1. Re:Brilliant idea by steelerguy · · Score: 4, Insightful

      This looks like it is only for non-fiction. It's usually not too hard to tell what a non-fiction book is about just by reading the title.

    2. Re:Brilliant idea by tomstdenis · · Score: 5, Insightful

      True enough, but quality is in question too. Not all Calculus textbooks, for example, are of equal educational value.

      It would be very valuable to be able to open a chapter of the book and give a read over it, you know, like in a real fucking bookstore.

      The problem being that stores [brick and mortar] like Chapters.ca stock only self-help dime-a-dozen whim-of-the-minute books. In fact when the local Chapters first opened you could walk in and buy TAOCP [I did :-)]. Now you would be lucky to get a calculus/algebra/science/anything textbook, and at best you can only find those "cheat sheet" books which basically tell you how to solve every problem [but not why the solution works].

      For the most part people have to blindly trust some review from "BigGuy4477" about the value of an $89 textbook...

      Tom

      --
      Someday, I'll have a real sig.
    3. Re:Brilliant idea by Surak · · Score: 4, Interesting

      You can already do that to an extent on Amazon and on BN.com. For some books, they let you look inside at the Intro, table of contents, and sometimes a chapter or two. You can usually see the liner notes and front and back cover too.

      Very cool. I've purchased books based on the ability to look inside the book.

      Of course this *could* be great for college paper researchers, looking for a quote or two to stick in a research paper. Depends on how much meat you can really get at.

      If it weren't for copyright issues, I'd love to see libraries do something like this. You already have the equivalent for magazine articles, but usually you have to either pay or actually go to the library to use their InfoTrac or whatever engine.

    4. Re:Brilliant idea by stephenbooth · · Score: 4, Interesting

      I'm far more likely to pay attention to the customer reviews than a write up from Amazon.

      I guess what I'm saying here is that if you buy a book from Amazon then please take a few minutes to write a quick review saying what you liked/hated about the book, it will help other people make a decision. I've found that Amazon are usually quite fair (well Amazon UK are) and will publish a negative review so long as it's clear and non-offensive. If you write "This book sux." it'll get dumped, something like "This book skips a lot of the detail you need for this sort of level." then it will probably get through.

      Even if I buy a book from somewhere else I'll usually write a review of it on Amazon.

      Stephen

      --
      "Don't write down to your readers, the only people less intelligent than you can't read" - Sign on Newspaper Office Wall
    5. Re:Brilliant idea by Lord_Dweomer · · Score: 4, Interesting
      "Amazon.com has their "Look inside this book" feature on a lot of titles, which lets you read a scanned excerpt of the book and see what you think. Just like in a real fucking bookstore!"

      Except in a 'real fucking bookstore' I can look through the table of contents to see if it has chapters that may sound interesting, and I can then read a little bit from a section of MY CHOOSING. I don't care what amazon wants me to see from a book, and yes I realize some is better than none, but the real beauty of a bookstore is to flip around the entire book with no restrictions and see if you like the whole thing.

      --
      Buy Steampunk Clothing Online!
    6. Re:Brilliant idea by aziraphale · · Score: 5, Informative

      Actually, most of the crappy writeups on Amazon are provided by the publisher, not Amazon at all. You're only looking at Amazon-originated content in the 'editorial reviews' section of a book page if it says 'Amazon.com' at the top. If it says 'From the Publisher', or 'Book Description', it's the publisher that provided it. This does, it must be said, stretch the definition of 'editorial reviews' somewhat.

      Oh, and the books Amazon promotes on its front page, or on section header pages, under headings like 'what we're reading this month' - Amazon doesn't put them there off its own bat - it's done in co-operation with publishers, with publishers buying placements with virtual money called 'co-operative marketing funds', which are allocated on the basis of how much money the publishers' books made for the bookstore the previous year. Same deal with physical bookstores of course - spend co-op money, and you can get your books 'face out' on the shelf (cover showing, rather than spine), or onto an 'end-cap' (a display shelf at the end of a row), or even onto a table display.

      A short time working in publishing is a great way to disabuse yourself of the notion that book stores know or care anything about the books they sell...

    7. Re:Brilliant idea by whatch+durrin · · Score: 5, Insightful
      From my experience with non-fiction (college textbooks) in a "brick-and-mortar" store, the books are usually sealed shut with plastic wrap. That only goes for new books, of course.

      Besides, in college you usually don't have a choice about which textbook to use for the class. I guess you could always purchase supplemental books, but those are usually out of the price range/interest level/time scope of many college students.

      --
      ***
      Radio Shack. You've got questions...we've got blank stares(TM).
  2. Patent this by number_man · · Score: 5, Funny

    Shouldn't somebody patent this process before Bezos does??

    1. Re:Patent this by keyslammer · · Score: 4, Interesting

      Have you missed the dozens of articles about people recently patenting things that've been around for 30+ years, then suing small businesses for cash?

      That's different: that's just blatant disregard for prior art. It's quite another matter if you announce something in a huge press release and _then_ try to patent it. You'd look like a moron because you yourself created the prior art! Not that this would stop Amazon...

  3. speaking of searching with Amazon by Artifex · · Score: 4, Insightful

    Have you noticed that they now offer web searching as well, and are also generating third-party ads based upon what you're looking for?

    This development may bite them back - when I look for something on Amazon now, I often find in their ads that other people have the item cheaper. Amazon may get a nickel or quarter for the referral, but they lose the dollars from the markup.

    --
    Get off my launchpad!
  4. Wonder how long before .... by binaryDigit · · Score: 5, Interesting

    ... someone writes a distributed bot to query targeting a specific book and sections to finally retrieve the entire book. If it's a distributed app, then it would be tougher for Amazon to block. You could even have it only go after certain parts of the books at different times to make it tougher. Now not to say that this is a good use of effort, but that never stopped anyone from doing such a thing before :)

  5. Amazon by jester · · Score: 5, Funny

    I remember when doing a search on Amazon for "Database Admin" returned the number 1 response of "The fine art of vaginal fisting" and the reviews that it prompted ... pushing this book up into the top 100 bestsellers. Now what would the ability to read some text from books do ;-)

  6. Perfect! by zapp · · Score: 4, Funny

    I always find it annoying when reading a paper book that I can't Ctrl-F to find a certain segment.

    Now I can just hop online to amazon, do the search, it will tell me what page it's on, and I can go read it!

    --
    no comment
  7. O'Reilly on steroids? by UnderAttack · · Score: 5, Informative

    Would this be like O'Reilly's Safari online books on steroids? Safari has been my favorite bookstore for a while now.

    --
    ---- join dshield.org Distributed Intrusion Detec
  8. Too bad ... by JSkills · · Score: 4, Interesting
    ... there was no mention of the actual search technology Amazon would be using to allow searching the text of such a large archive of books (why only non-fiction I wonder).

    Looks like they'll be going with a proprietary solution. Even though the article seems to indicate that Amazon is launching this new service as a response to Google's "Froogle" shopping search product, wouldn't partnering with Google make more sense for them?

    1. Re:Too bad ... by binaryDigit · · Score: 4, Insightful

      .. there was no mention of the actual search technology Amazon would be using to allow searching the text of such a large archive of books (why only non-fiction I wonder).

      This type of text searching has been around for a gazillion years and is not really that complex. It really depends on how flexible they want to make the searching. Case in point, wildcards. Google sacrifices flexibility by not allowing you to search on wildcards in their news searches in order to gain speed. Ditto for things like phrase searching, etc. The actual # of docs is pretty much irrelevant wrt search speed (at least directly). It depends more on the features you allow in your query language and the # of hits returned by each part of your query. Plus you're dealing with static data that can easily be distributed.

      The tough part of all this is getting the stuff in digital format. I assume for most current books it won't be a problem. The hassle would be older books that you'd actually have to OCR. Though once they're done, they would have a pretty valuable asset.
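
A minimal sketch of the point above, assuming a plain positional inverted index (nothing in the article says what Amazon actually uses). The function names and the page-id scheme are made up for illustration. Each query term is a dictionary lookup, so the cost tracks the number of hits per term and the query features you support (phrase search needs the extra position check), not the raw number of documents.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(pages):
    """Build term -> {page_id: [positions]} from a dict of page_id -> text.

    The data is static, so this can be built once and split across machines.
    """
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(page_id, []).append(pos)
    return index


def search_all(index, query):
    """Return the page ids that contain every term in the query (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Intersect the smallest posting sets first to keep intermediate results small.
    postings = sorted((set(index.get(t, {})) for t in terms), key=len)
    result = postings[0]
    for other in postings[1:]:
        result = result & other
    return result


def search_phrase(index, phrase):
    """Return the page ids where the terms appear consecutively, in order."""
    terms = tokenize(phrase)
    hits = set()
    for page_id in search_all(index, phrase):
        starts = index[terms[0]][page_id]
        # A phrase matches if term i sits exactly i positions after the first term.
        if any(
            all(s + i in index[t][page_id] for i, t in enumerate(terms))
            for s in starts
        ):
            hits.add(page_id)
    return hits
```

Wildcards are the expensive feature here: a query like "calc*" has to expand to every matching term before any lookup happens, which is presumably the kind of trade-off being described.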

    2. Re:Too bad ... by ceejayoz · · Score: 4, Insightful

      Looks like they'll be going with a proprietary solution... wouldn't partnering with Google make more sense for them?

      You are aware that Google's a proprietary solution, right?

      Just because Slashdot loves Google doesn't mean it's all of a sudden non-proprietary!

  9. Re:if I search for "the" will all pages come up? n by mirko · · Score: 5, Funny

    and if you look for "TEH", will you be redirected to Salshdot ?

    --
    Trolling using another account since 2005.
  10. Be careful, Amazon! by grub · · Score: 5, Funny


    Any returns of C or C++ code might get SCO's law team on your ass..

    --
    Trolling is a art,
  11. legal? by hatrisc · · Score: 4, Insightful

    doesn't this infringe on basically every copyright that the publishing industry has?

    --
    I write code.
    1. Re:legal? by keyslammer · · Score: 4, Informative

      It is well established that you can cite portions of a work (which seems to be what they're doing). If the portions are especially large, I would imagine that they'd have to get permission from the publishers.

      Of course, as Amazon, they're probably in a position to do so.

    2. Re:legal? by aziraphale · · Score: 4, Funny

      Crikey - you're right - I bet Amazon didn't think of that. We should get Jeff Bezos on the phone right now and tell him.

      Oh no, hang on, it seems that they have thought of it. Thank goodness for that - no need for an eagle eyed Slashdot reader to point out the error of their ways.

      It seems that, because Amazon has the entire publishing industry over a barrel nowadays, just a few quick calls from Amazon to their biggest suppliers, and a notice in Publishers Weekly, and they can go ahead and do whatever they like with the content of the books they sell.

      You know, in some music stores, you can go up to listening points and hear music, on demand, without paying for it. D'you think the RIAA should be told? I bet they'd be really keen to sue their key supply channel for this obvious copyright infringement...

  12. Invasion of Privacy by BillFarber · · Score: 5, Funny

    Isn't this a violation of the privacy of all the people who have biographies for sale at amazon? John Ashcroft could search the text and find out anything he wants about Abraham Lincoln! This article should be listed under "Your Rights Online".

  13. this could be huge... by jaxle · · Score: 4, Interesting

    This would be awesome for students. I've always wished I could just execute a search function through a book to find what I was looking for. It can be a p.i.t.a. to use indexes and thumb around until you find what you need.

  14. It's not the writeups, it's the moderation. by Thag · · Score: 4, Interesting

    The real issue is that Amazon's system doesn't do moderation very well, and as a result the reviews get spammed with people who really really like something.

    Or, you get situations where teachers apparently tell their classes to submit reviews on Amazon for a book, and you have 30 reviews that say nothing.

    And, of course, being a bookseller, there is a strong motivation for them to bias things so that positive reviews outweigh negative ones.

    Jon Acheson

    --
    All opinions expressed herein are my own, and not those of my employers, who are appalled.
  15. Like META tags in books? by JZ_Tonka · · Score: 5, Funny

    This will then prompt publishers to include several pages at the beginning of every book with nothing but "sex sex sex sex sex sex..."

  16. Your tax dollars at work by Anonymous Coward · · Score: 5, Informative

    The NIH has a good start with something of this nature. The NCBI (part of the National Library of Medicine) has a fully-searchable set of about 20 books. The books generally cover biology topics, but represent some of the standard texts used in college courses. They call the project Bookshelf and it is entirely free. Several books contain direct links to gene sequences, etc.

  17. Definitely! by Schezar · · Score: 4, Funny

    "Of course this *could* be great for college paper researchers, looking for a quote or two to stick in a research paper. Depends on how much meat you can really get at."

    College is great in this respect. No matter how crazy, ill-conceived, or outlandish your premise is, there are a thousand nut-jobs out there with nice quotations to support it. This would make it even easier to back that drivel up. Especially late the night before it's due, when you need to support that last flimsy claim in order for your paper to make sense.

    --
    GeekNights!
    Late Night Radio for Geeks!
  18. Research Humanity vs. P2P by tyrani · · Score: 4, Insightful

    This sounds like a good project that they could get some gov't funding for.

    Besides the obvious copyright problems, if the gov't were to get involved and Amazon (or whoever) were allowed to let people search an entire book for concepts / keywords, but not view the entire book without paying for it, this would increase both sales and usefulness.

    If this had been the original model for online music, think of all the problems that would have been avoided. Perhaps a second look at this type of archiving will help the movie industry as bandwidth increases.

    --
    rejected (19) accepted (0)
    Is there a psychological term related to getting your stories rejected on slashdot?
  19. What about searching through the old stuff? by machinecraig · · Score: 5, Informative

    I'm surprised nobody's mentioned Project Gutenberg - I mean, they've been OCRing public domain books for a long time now, and there are thousands of texts available... not in some crappy interface that Amazon will use, but in wonderful, sweet, ascii text format. Couple this with some good regular expressions and you're in business... want to see how many times Sherlock Holmes talked about using cocaine? It's elementary!
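
A rough sketch of the regular-expression approach, assuming you have already downloaded a plain-text book to disk. The helper names and the file name in the usage note are hypothetical, not anything Project Gutenberg provides.

```python
import re


def count_mentions(text, pattern):
    """Count case-insensitive matches of a regular expression in the text."""
    return sum(1 for _ in re.finditer(pattern, text, flags=re.IGNORECASE))


def mentions_in_context(text, pattern, width=40):
    """Return each match with up to `width` characters of context on either side."""
    snippets = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        # Collapse line breaks so each snippet reads as a single line.
        snippets.append(" ".join(text[start:end].split()))
    return snippets
```

To try it, read a local text file (say, a hypothetical `sign_of_four.txt`) into a string and call `count_mentions(text, r"\bcocaine\b")`. The word boundaries keep the pattern from matching inside longer words.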

  20. RealLife? by ryanoo · · Score: 5, Insightful
    The publishers said they have been guardedly cooperative.

    How authors will react is another question.

    Isn't this what happens in the RealWorld? You walk into a bookstore, open it up, read a few pages and make a decision on whether or not you want to buy it?

    I think publishers and authors would be rather short-sighted not to allow potential customers to shop online the same way they shop in brick and mortar stores.

  21. Piece by piece, by pair-a-noyd · · Score: 4, Funny

    search a little, store a little. Search a little store a little more.

    Pretty soon you'll have the entire book.

    They'll have an app out to search the pieces out and stitch them together into one complete book..

    Yeah, this will work, thanks for the free ebooks Amazon..

  22. Re:One rule for them... by aziraphale · · Score: 4, Informative

    > Are Amazon obtaining each and every rights owners' permission to perform this duplication? I doubt it

    Why do you doubt it? You do realise that Amazon has a direct business relationship with every publisher whose books it sells already, don't you? They don't buy their books from Barnes & Noble...

    Amazon's book buyers will offer this facility to publishers (whose salespeople they already work with directly - many publishers will employ one person whose entire job is selling books to Amazon) as a marketing benefit - and charge them for the privilege, no doubt - just as they do today with their 'look inside' feature. In order to keep competitive, publishers will prepare and supply the text in the format Amazon wants. It's really not hard for Amazon to do this at all.

  23. Oh Goody by jayhawk88 · · Score: 5, Funny

    *Accessing http://www.amazon.com/search*
    Enter your search criteria:______________
    *Enter search "Moby Dick"*
    Search Complete:

    Moby Dick
    by: Herman Melville

    Call me...
    Would You Like to Read More? This title can be purchased for $14.95 through our...


    *Back Button*
    Enter your search criteria_____________
    *Enter search "Tale of Two Cities"*
    Search Complete:

    A Tale of Two Cities
    by: Charles Dickens

    It was the best of times, it was the...
    Would You Like to Read More? This title can be purchased for $29.95 through our...


    *Back Button-Back Button-Back Button-Close*