Amazon Plan Would Allow Text Search Of Books
emmastory writes "The New York Times is running a story (free registration required) about a new development at Amazon - they plan to assemble "a searchable online archive with the texts of tens of thousands of books of nonfiction." Users would only be able to read a certain portion of the text from any one book, but it sounds promising nonetheless. The Times article suggests that this is part of a larger strategy to compete with Google and Yahoo by making Amazon an authoritative source of information on everything book-related."
This looks like it is only for non-fiction. It's usually not too hard to tell what a non-fiction book is about just by reading the title.
Have you noticed that they now offer web searching as well, and are also generating third-party ads based upon what you're looking for?
This development may bite them back - when I look for something on Amazon now, I often find in their ads that other people have the item cheaper. Amazon may get a nickel or quarter for the referral, but they lose the dollars from the markup.
Get off my launchpad!
True enough, but quality is in question too. Not all Calculus textbooks, for example, are of equal educational value.
It would be very valuable to be able to open a chapter of the book and give it a read-over, you know, like in a real fucking bookstore.
The problem being that stores [brick and mortar] like Chapters.ca stock only self-help dime-a-dozen whim-of-the-minute books. In fact, when the local Chapters first opened you could walk in and buy TAOCP [I did :-)]. Now you would be lucky to get a calculus/algebra/science/anything textbook, and at best you can only find those "cheat sheet" books which basically tell you how to solve every problem [but not why the solution works].
For the most part people have to blindly trust some review from "BigGuy4477" about the value of an $89 textbook...
Tom
Someday, I'll have a real sig.
See... I would pay up to about 50 dollars a month to have free access to reading those books online... I guess the problem would be printing them out and redistributing them. Perhaps maybe just manuals... I am so sick of shelling out 50 bucks so I can read 5 pages about some topic knowing I will never read the rest of the book. Love the web ... information is free ... hate the web ... information is not reliable and all over the place. :(
doesn't this infringe on basically every copyright that the publishing industry has?
I write code.
IANAL, but I think now that they've announced it, it can't be patented (unless it already has been).
.. there was no mention of the actual search technology Amazon would be using to allow searching the text of such a large archive of books (why only non-fiction I wonder).
This type of text searching has been around for a gazillion years and is not really that complex. It really depends on how flexible they want to make the searching. Case in point, wildcards. Google sacrifices flexibility by not allowing you to search on wildcards in their news searches in order to gain speed. Ditto for things like phrase searching, etc. The actual # of docs is pretty much irrelevant wrt search speed (at least directly). It depends more on the features you allow in your query language and the # of hits returned by each part of your query. Plus you're dealing with static data that can easily be distributed.
The tough part of all this is getting the stuff in digital format. I assume for most current books it won't be a problem. The hassle would be older books that you'd actually have to OCR. Though once they're done, they would have a pretty valuable asset.
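To make the point about query features concrete, here is a minimal sketch of a positional inverted index in Python. Nothing here is Amazon's (or Google's) actual design; the class name `PositionalIndex` and its methods are made up for illustration. It shows why a plain term lookup is cheap, why a phrase query costs more (it has to compare word positions in every candidate document), and why a naive wildcard is the expensive one (it walks the whole vocabulary).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class PositionalIndex:
    """Hypothetical toy index: term -> {doc_id -> [word positions]}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        """Record the position of every word in the document."""
        for pos, term in enumerate(tokenize(text)):
            self.postings[term].setdefault(doc_id, []).append(pos)

    def search_term(self, term):
        """Single-term lookup: one dictionary access."""
        return set(self.postings.get(term.lower(), {}))

    def search_phrase(self, phrase):
        """Phrase lookup: intersect per-term hits, then check adjacency."""
        terms = tokenize(phrase)
        if not terms:
            return set()
        candidates = set.intersection(*(self.search_term(t) for t in terms))
        hits = set()
        for doc in candidates:
            for start in self.postings[terms[0]][doc]:
                # Every following term must sit at the next position.
                if all(start + i in self.postings[t][doc]
                       for i, t in enumerate(terms)):
                    hits.add(doc)
                    break
        return hits

    def search_prefix(self, prefix):
        """Trailing-wildcard lookup (e.g. calc*): scans every indexed term."""
        prefix = prefix.lower()
        hits = set()
        for term, docs in self.postings.items():
            if term.startswith(prefix):
                hits.update(docs)
        return hits


index = PositionalIndex()
index.add("a", "The origin of species by natural selection")
index.add("b", "Species of calculus textbooks: the origin story")
print(index.search_term("species"))
print(index.search_phrase("origin of species"))
print(index.search_prefix("calc"))
```

Since the data is static, an index like this could be built once per book and spread across machines by document, which is the "easily distributed" part.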
IANAL, but I think now that they've announced it, it can't be patented (unless it already has been)
That's funny. Oh... you're not trying to be funny.
Have you missed the dozens of articles about people recently patenting things that've been around for 30+ years, then suing small businesses for cash?
The USPTO seems to grant a surprising number of patents on things that "can't be patented".
Looks like they'll be going with a proprietary solution... wouldn't partnering with Google make more sense for them?
You are aware that Google's a proprietary solution, right?
Just because Slashdot loves Google doesn't mean it's all of a sudden non-proprietary!
So if someone wrote a script that sequentially searches for the most popular words, you could end up with the whole text?
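A rough way to reason about that question: simulate it. The sketch below is only a toy model under loud assumptions. The function name `snippet_coverage` is hypothetical, and so is the idea that each hit reveals a fixed window of words on either side; the article doesn't say how big Amazon's excerpts are. It also ignores the per-book cap on how much a user can read, which is presumably there to stop exactly this.

```python
import re
from collections import Counter


def snippet_coverage(text, num_queries, window):
    """Fraction of the words revealed by searching the most common words.

    Assumes (hypothetically) that every hit shows `window` words on each
    side of the matched word, and that no per-book viewing limit applies.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    # The queries are the num_queries most frequent words in the text.
    queries = {word for word, _ in Counter(words).most_common(num_queries)}
    revealed = set()
    for i, word in enumerate(words):
        if word in queries:
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            revealed.update(range(lo, hi))
    return len(revealed) / len(words)


sample = "the cat sat on the mat and the dog sat on the log"
print(snippet_coverage(sample, num_queries=1, window=0))
print(snippet_coverage(sample, num_queries=1, window=2))
```

In this toy sentence a single very common word with a two-word window already exposes everything, so any real service would have to lean on a cap on total text shown per book rather than on the snippet size alone.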
This sounds like a good project that they could get some gov't funding for.
Besides the obvious copyright problems, if the gov't were to get involved and Amazon (or whoever) were allowed to permit searching an entire book for concepts / keywords, but not viewing the entire book without paying for it, this would increase both sales and usefulness.
If this had been the original model for online music, think of all the problems that would have been avoided. Perhaps a second look at this type of archiving will help the movie industry as bandwidth increases.
rejected (19) accepted (0)
Is there a psychological term related to getting your stories rejected on slashdot?
Not exactly, I think.
Safari is access to the whole content of the book on-line, as well as searching for text within that content as well as any other books they have available on-line. IOW, Safari is actually a superset of the Amazon thing, since you can pay to read the whole book, not just search through it for snippets and passages.
I love Safari as well - saves shelf space, trees and frustration (because of the search function). I wouldn't want to read a novel on-line, since a paper book is a better interface for that, but for reference material about programming/networking/Operating Systems etc., Safari works well, since you're in front of a machine anyway. And IIRC, errata in the books are applied directly to the text on-line, and you get the latest edition without having to get another book, just updated content.
The only time having all of your reference material on-line would be a problem is if you need ref. material to get your Cisco router that connects you to the Internet back on-line.
Soko
"Depression is merely anger without enthusiasm." - Anonymous
Most of the "look inside" pages I've seen have been stuff like the table of contents, index, or back page author biography. Not the text that people read.
They'd probably try and get a few publishers on board so that they can be supplied with digital versions of the text. I can't imagine that they would OCR everything... so they'd negotiate what they could from the outset.
This would be very easy for publishers to accommodate, and they would do so more willingly if the book was old (e.g. Origin Of Species, etc).
How authors will react is another question.
Isn't this what happens in the RealWorld? You walk into a bookstore, open it up, read a few pages and make a decision on whether or not you want to buy it?
I think publishers and authors would be rather short-sighted not to allow potential customers to shop online the same way they shop in brick and mortar stores.
Ryan O'Rourke
Besides, in college you usually don't have a choice about which textbook to use for the class. I guess you could always purchase supplemental books, but those are usually out of the price range/interest level/time scope of many college students.
***
Radio Shack. You've got questions...we've got blank stares(TM).
Just imagine if Amazon did some deal with the Library of Congress that allowed them to scan in nearly every book published in the United States. Once the information is digitally stored, it could be utilized in other ways as well:
- Libraries around the country could offer consoles on which you could read any book through a secure connection of some type, preventing the unauthorized copying that would otherwise keep book publishers from agreeing to this. You could essentially read any book, even if the library doesn't have it.
- Bookstores, schools and other organizations might get in on this network and offer the same service.
This service doesn't even have to be free. I'd pay a subscription fee to have access to this information, as would the bookstores and whatnot.
One example from current events: Bush said in his State of the Union address, "The British government has learned that Saddam Hussein recently sought significant quantities of uranium from Africa."
However, several news organizations excluded the first six words of that sentence, and then called the President a liar. The President's intelligence or honesty aside, intentionally excluding these words dramatically distorts the meaning of the phrase, to the detriment of those using the filter.
Why don't you just go to the "real f---ing bookstore?"
If you don't like how an online business does things, don't use the online business.
If you don't realize the difference between a brick and mortar store providing physical access to the product and an online store providing a digital copy of the product, you need to get your head examined.
Basically they would be giving the book away. My guess is that the publisher has a problem with that.
Original point, if you don't like the rules, don't play the game.
Norris/Palin 2012
Fact: We deserve leaders who can kick your ass and field dress your carcass.