Amazon Launches Full Text Book Search
m00nun1t writes "Amazon have launched a new service that allows you to search the full text of books. This sounds like an incredibly useful function as well as technically impressive at this scale. I wonder if a patent is in the works." Or if a patent is already owned.
How useful is this, considering that we can't see what's in the books before buying?
Sure, you can search for some random phrase. But who's to say it's not out of context, or that there's anything more that's relevant in the book?
please /. spell your titles right...! launches with an e!
Back in the early days of the web, when Yahoo was still a catalog of links and not some super news/search/auction/ebusiness/do-it-all website that it is now, searches were much more fun.
You really never knew what would turn up as you traversed the Yahoo directory structure. You'd start searching for blues music and you'd end up with a list of 15 or so good links with .wav samples and more than likely an artist you'd never heard of before. That was the best part, getting introduced to things you hadn't even thought to look for.
As search techniques are becoming more refined, we are now able to do specific word searches on websites and now books. That's fine if you know exactly what you are looking for. For example if you want to get that book about 'replicants' you'll find Blade Runner, but you won't find anything else. You won't get any information except exactly the thing you are looking for.
And I think that that is where the problem with this kind of search lies for books/music/etc. If you want to find a song or a book, it most likely isn't going to be a specific word you remember, it will be the tune or the plot, both of which are not searchable.
I don't see this improvement in Amazon's search system as that much of an improvement. A better improvement could be made to the 'We thought you'd like' feature. Instead of finding only what I'm looking for, I'd like to find other things I might also be interested in.
I remember a teacher once telling a class I was in that our essays may be compared to other essays published online to check for plagiarism.
Granted, Amazon.com's feature will only (for now) include 150,000 books, but this may very well be another way to catch plagiarizers. Just type in a suspicious phrase and see if there are any 'hits'.
I did a quick search using their demo for "Curse of the Bambino" in the 'Try Searches' area of the page linked to in the main /. story. After choosing the first book (Curse of the Bambino) I did another search just within that book (from the link on their page) for "Bambino." This turned up 129 of the 240 pages. Since you can display a couple of pages before and after the chosen page, it's easy to browse through the rest of the pages in the book by just choosing a word on the page preceding/following the one displayed and doing another search for that word. I can't imagine this won't be abused...wonder how Amazon will deal with this. Perhaps a limit on each account for pages viewed per book/time period?
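For what it's worth, a limit like that is not hard to sketch. This is only a rough illustration of the idea, not anything Amazon has said it does: the class name, the cap of 20 pages, and the 24-hour window are all made-up assumptions.

```python
import time
from collections import defaultdict, deque


class PageViewLimiter:
    """Hypothetical cap on pages viewed per account, per book, per time window."""

    def __init__(self, max_pages=20, window_seconds=24 * 3600):
        # Both defaults are invented for illustration.
        self.max_pages = max_pages
        self.window_seconds = window_seconds
        # (account, book) -> timestamps of recent page views, oldest first
        self._views = defaultdict(deque)

    def allow(self, account, book, now=None):
        """Return True and record the view if the account is under the cap."""
        now = time.time() if now is None else now
        views = self._views[(account, book)]
        # Forget views that have aged out of the window.
        while views and now - views[0] >= self.window_seconds:
            views.popleft()
        if len(views) >= self.max_pages:
            return False
        views.append(now)
        return True
```

Every page request, whether it comes from a search hit or from the before/after links, would go through `allow()`, so re-searching for a word on the neighboring page stops paying off once the cap for that book is hit.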
I'd love to be able to browse a giant back catalog, knowing that an original or facsimile copy could definitely be delivered to me.
Yeah, funny. But, I would bet that there will be a patent on this. I would also bet it has already been applied for. I mean really, this is actually really innovative for them, there must be something patentable in this.
Anyways, stay tuned, I believe the Patent Office takes about a year these days to issue a patent?
The story will of course run here on Slashdot.
You have to have an account to view the pages. Fine, great. But then it brought up this screen:
By publishers' agreement, we are pleased to offer Amazon.com customers with a valid credit card the ability to view copyrighted pages.
Your account will not be charged.
This one-time process enables you to view limited copyrighted material through our Search Inside the Book feature.
So they'll let you browse the search pages, if you can prove your identity on record and provide them with financial information. No thanks.
How easily can this service be abused, with automatic webbots doing the searching?
Not so easily. It's easy to see why. The books will be scanned in using OCR. These days a fast and convenient and almost error-free process. But not entirely error-free. Good enough to find documents that are highly relevant to a particular keyword (if "hydraulics" occurs 9 times, what are the odds of OCR getting it wrong all 9 times?) but not good enough for entirely automated book-to-text.
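To put rough numbers on that, here is a back-of-the-envelope sketch. The 2% per-word error rate and the 300 words per page are made-up figures for illustration, and the math assumes OCR errors on different words are independent, which real scans won't strictly obey.

```python
def chance_all_missed(per_word_error_rate, occurrences):
    """Probability that OCR garbles every single occurrence of a keyword."""
    return per_word_error_rate ** occurrences


def chance_page_clean(per_word_error_rate, words_on_page):
    """Probability that OCR gets every word on a page right."""
    return (1 - per_word_error_rate) ** words_on_page


# Assumed figures, not measurements.
ERROR_RATE = 0.02
print(chance_all_missed(ERROR_RATE, 9))    # keyword lost in all 9 places
print(chance_page_clean(ERROR_RATE, 300))  # whole page comes out error-free
```

Under those assumptions the keyword is essentially never lost from the index, while a completely clean page of text is rare. Same error rate, opposite conclusions for search versus straight book-to-text.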
If Amazon displayed highlighted portions of the book's contents, it would probably not exceed a few lines (just like Google doesn't present entire webpages in its result screen). If they did want to show more, they'd have to show an image of the scanned-in page anyway, since OCR errors would not be very pretty. (A lot of digital archiving products use a similar approach; they index PDF files that contain the OCR'ed text, invisible to the end-user, and the scanned pages as content which the end-user looks at).
Besides, to search for each page of a book, you'd have to search for a keyword on each page of that book. Such keywords would most easily be extracted by scanning in the book via OCR anyway!
SCO employee? Check out the bounty
Some people would say that "most of History's greatest" music is also available for free. I for one prefer modern music over Bach, but the classics are free.
I don't see anything wrong with what the poster is doing. He used Amazon's system to identify books in which his/her work was not correctly attributed.
How is this an abuse of the legal system???
I have no sig yet I must scream.
You could have robots trolling this section all day.
Uhuh. Security. Whose?
What's your point? You think Amazon is a dishonest porn site that takes your credit card information and disappears the next day?
Yeah, I want to be financially secure too! If that's your mentality, how are you surfing the web?
What the fsck's your point man? What does amazon demanding your credit card number for security have to do with you "wanting to be financially secure"? How did you even get modded up in the first place?
A few million people shop through Amazon. You think unauthorized purchases and fraudulent credit card transactions show up every month on their statements? Jeez, get a life dude.
Bush is on fire and its not good for my lungs.
Obviously Amazon is aware of all of the "Mickey Mouse" and "Slashdot" type accounts that the New York Times garners. I would assume that Amazon's intent is that by requesting some information that you would not be prepared to share with others they can avoid this, and thus prevent some abuse. Let's face it, Amazon isn't some dodgy peddler of porn and pills that trades from a different URL each week, plus if you have got an account, it was probably to order something, which means they already *have* your credit card number.
UNIX? They're not even circumcised! Savages!
Or if a patent is already owned.
This type of editorializing is pathetic in that its only purpose is to stir up the masses. Gee...now let's take a look shall we? 20% of the comments are "patents suck" or "isn't this some example prior art"?
This story is about a new feature people...it's not about a patent. Wipe the froth from your mouths and comment on the merits (or lack of) the feature...not on a completely fabricated hypothetical comment meant to incite you into a frenzy.