Amazon Launches Full Text Book Search
m00nun1t writes "Amazon have launched a new service that allows you to search the full text of books. This sounds like an incredibly useful function as well as technically impressive at this scale. I wonder if a patent is in the works." Or if a patent is already owned.
Can you do it with one click?
I can almost hear the screams of joy from the underground book pirates.
How easily can this service be abused, with automatic webbots doing the searching?
I can imagine there might be filters, time limits, and max searches/day limits for something of this scale, no?
user@host$ diff
2) It returned a lot of results
Conclusion: It works!!!
Back in the early days of the web, when Yahoo was still a catalog of links and not some super news/search/auction/ebusiness/do-it-all website that it is now, searches were much more fun.
You really never knew what would turn up as you traversed the Yahoo directory structure. You'd start searching for blues music and you'd end up with a list of 15 or so good links with .wav samples and more than likely an artist you'd never heard of before. That was the best part, getting introduced to things you hadn't even thought to look for.
As search techniques are becoming more refined, we are now able to do specific word searches on websites and now books. That's fine if you know exactly what you are looking for. For example if you want to get that book about 'replicants' you'll find Blade Runner, but you won't find anything else. You won't get any information except exactly the thing you are looking for.
And I think that that is where the problem with this kind of search lies for books/music/etc. If you want to find a song or a book, it most likely isn't going to be a specific word you remember, it will be the tune or the plot, both of which are not searchable.
I don't see this improvement in Amazon's search system as that much of an improvement. A better improvement could be made to the 'We thought you'd like' feature. Instead of finding only what I'm looking for, I'd like to find other things I might also be interested in.
I remember a teacher once telling a class I was in that our essays may be compared to other essays published online to check for plagiarism.
Granted, Amazon.com's feature will only (for now) include 150,000 books, but this may very well be another way to catch plagiarizers. Just type in a suspicious phrase and see if there are any 'hits'.
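To make the "type in a suspicious phrase and see if there are any hits" idea concrete, here is a minimal Python sketch that looks for runs of words shared between an essay and a source text. The function names, the five-word window and the sample sentences are all assumptions made up for illustration; this is not how Amazon's search or any real plagiarism checker is known to work.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(essay, source, n=5):
    """Return the n-word sequences that appear in both texts."""
    return word_ngrams(essay, n) & word_ngrams(source, n)


# Hypothetical sample texts, invented for this example.
essay = "As I argued, the quick brown fox jumps over the lazy dog every day."
source = "The quick brown fox jumps over the lazy dog."

matches = shared_phrases(essay, source)
for phrase in sorted(matches):
    print(" ".join(phrase))
```

A long run of shared five-word sequences is only a hint that a passage deserves a closer look, not proof of copying; common phrases and properly quoted material will match too.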
Even though he said he was 'blown away' by Amazon's new Search Inside the Book feature, Tim O'Reilly has decided not to participate in the program for now. 'If they end up being a Google for published content...we need to think better about what publishers get out of it,' he said.
There's books about everything:
Encyclopedia of New Media : An Essential Reference to Communication and Technology -- Steve Jones (Editor); Hardcover
Excerpt from page 0: ". . . post-ranking system used by members of the Web message board Slashdot.org, began as a result of community self-restraint in the face of unrelenting trolls (pointlessly hostile posters). In addition, some cyberspace forums now require . . ."
See more references to slashdot troll in this book.
The editors of the world reknowned Slashdot has recently proven to the wrld that they are unable to correct small spelling mistackes and grammar issues.
It is really nice. I was using Amazon right as they switched it on.
I was searching for books on Object Role Modeling (ORM). I had first done a search for ORM and did not find anything of interest. They then switched it on while I did a search for 'Object Role Modeling'; this popped up a few books, with the text where the term was being used.
Now I can cut+paste my homeworks! yay!
I'd love to be able to browse a giant back catalog, knowing that an original or facsimile copy could definitely be delivered to me.
In other news... Amazon announced that the USPTO has granted them a patent on their proprietary "one click search" technology.
When questioned for comment Google CEO Eric Schmidt said "ug".
Youth in the old days: look up 'vagina' in a dictionary.
Youth nowadays: look up 'vagina' in all books on this planet.
I tried the search again today and got nearly 5,000 results, with the capability to actually look inside the book and see if the reference is useful to me. Very impressive indeed, patent or no patent.
Peer Pressure
Bash Amazon all you want, but this is a very useful technology.
In five minutes I was able to find three books that talked about findings first listed in two of my own published scientific papers, yet these books did not cite me, or anyone else, as the source of that information. My lawyer is currently preparing three letters.
I also found two other books in which the author used verbatim quotes and original theories from various interviews I have given, yet both authors passed off the statements as their own. My lawyer is now preparing five letters.
Aside from being used to protect my own research rights, I have found the search system useful for finding topics of interest discussed in certain books which are not referenced in any of the descriptions about the books. I just ordered three books I would not otherwise have ever purchased.
While I don't think highly of all of Amazon's practices, I must hand it to them for whatever technical undertaking created this search feature.
I warmly welcome any initiative that makes more and more books, or parts thereof, available online.
I used to think, like many people, that ebooks just didn't work because 'I like the feel of paper under my fingers'. Since I bought a PDA and discovered the joys of Fictionwise, I just can't go back to these clumsy wood pulp apparels.
Amazon is pretty progressive in this regard, making a great number of their collection available electronically. It was probably fairly easy from there to make their stock searchable. And how I wish the MPAA and RIAA could work like the publishing industry...
The existence of ebooks is NOT threatening traditional books, because people see more value in a printed book than in an electronic copy. This is clearly not the case with a CD or a DVD, since most people couldn't care less about the jacket if they have the goods on the CD/DVD. I wish the MPAA and RIAA would understand how to make traditional CDs and DVDs "value-added", and make people less inclined to get a computer file instead of shelling out the money.
Then again, I guess the case with ebooks is that your typical DVD or CD pirate is just not interested in swapping files to get the latest Stephen King and read it on screen. Not only that, but most of History's greatest books are available for free, and one could probably read free books for the rest of one's life if one so chose.
Or you could save a great deal of time and GO TO A PUBLIC LIBRARY and CHECK IT OUT FOR FREE
Slashdot Eds Link Anonymous Posts With Logged Posts
They Are Vermin Feeding On Each Other's Feces.
I Hate \.
You can read the page it is on and +/- two pages.
This is the equivalent of the facility you have in a physical bookstore to open a book and browse a few pages before purchasing. I can see it might be very useful, if they get the majority of books in a field accessible like this.
I wanted a PHP book the other day, and it is very difficult to decide which one of the plethora available I wanted. So I went to my physical bookstore. Smaller choice, but I could open each and get an impression of whether they were slow, detail-by-detail dummies books or the sort of high-speed summary I wanted.
Consciousness is an illusion caused by an excess of self consciousness.
Why do I need to enter a credit card number?
We require credit card information for security purposes only. We will not charge your credit card account any fees for using the Search Inside the Book feature.
Uhuh. Security. Whose?
Yeah, I want to be financially secure too !
I don't think there is, in the general case, a correct answer to whether collectives should be singular or plural - it depends upon the context.
"Congress have failed to agree..." because you are talking about a load of squabbling politicians who are definitely plural.
"Congress has passed a bill..." because those politicians have managed to achieve a consensus and act as a single entity.
In this case the singular is correct, because Amazon as an entity is offering a new service. But you could use the term collectively for all employees of Amazon.
Consciousness is an illusion caused by an excess of self consciousness.
You have to have an account to view the pages. Fine, great. But then it brought up this screen:
By publishers' agreement, we are pleased to offer Amazon.com customers with a valid credit card the ability to view copyrighted pages.
Your account will not be charged.
This one-time process enables you to view limited copyrighted material through our Search Inside the Book feature.
So they'll let you browse the search pages, if you can prove your identity on record and provide them with financial information. No thanks.
As I read this rather interesting post, I am trying to figure out why Amazon took this route rather than the many many routes available to them to publicise or provide a richer experience to the average Joe buyer...
I can think of one reason, and that has already been mentioned by a few /.ers here, i.e., the ability to help researchers find obscure stuff that won't find its way even into the Google scheme of things, but that is not a huge majority, so I am left wondering.
Even with a full text search facility I doubt very much if it can come close to matching the experience of flipping through a book at the local book store, no matter how effective the searching facility.
- ramas opines !!
Neat idea, but some excerpts come out all wrong:
See this for example...
Mass-OCR'ing has its drawbacks..
Slashdot: stuff for news, nerds that matter, matter for news, stuff that nerd
"grep."
I believe there is a body of prior art for scanning in books and grepping them. Is that not one of the oft-repeated benefits of ebooks?
Whether or not Amazon can get a patent on a shell script to serve up the results . . . on the web oooooooo, remains to be seen I suppose.
They managed to get one on "Give me one of those, put it on my account and drop it by my house" a "technology" my grocer has been offering over the phone for 40 years that I'm personally aware of.
However, since this sort of "technology" is exactly the sort of thing that the web, and the internet itself for that matter, was invented for I'd have to guess there's a lot of prior art. It's certainly obvious and trivial, but that doesn't seem to count for much these days.
The problem with things that are so obvious and trivial that "everyone" has been doing it for decades is that it's hard to demonstrate in court because no one bothers to document it.
Can you prove your grandfather put his pants on one leg at a time?
Common sense tells you he did, but common sense no longer applies in an age that grants patents to perpetual motion machines and peanut butter sandwiches.
KFG
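As a rough illustration of the "scan the books and grep them" idea above, here is a minimal Python sketch of a phrase search that returns a short excerpt around each hit. The `search_books` function, the two-title corpus and the 30-character context window are all made up for this example; nothing here reflects how Amazon actually implements its search.

```python
import re


def search_books(books, phrase, context=30):
    """Return (title, excerpt) pairs for every case-insensitive match of phrase."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    hits = []
    for title, text in books.items():
        for match in pattern.finditer(text):
            # Keep a little text on either side so the hit can be judged in context.
            start = max(match.start() - context, 0)
            end = min(match.end() + context, len(text))
            hits.append((title, text[start:end]))
    return hits


# Hypothetical corpus standing in for OCR'ed book text.
books = {
    "Book A": "This chapter introduces Object Role Modeling as a design method.",
    "Book B": "Nothing relevant appears anywhere in this text.",
}

hits = search_books(books, "object role modeling")
for title, excerpt in hits:
    print(f"{title}: ...{excerpt}...")
```

A linear scan like this is fine for a handful of texts; anything at the scale of a bookstore catalog would presumably need a prebuilt index instead.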
Umm, maybe you missed the point. It allows me to find others who are stealing from *me*
When you pass off someone else's ideas as those of your own it's called plagiarism.
I'm not suing them for any monetary damages. Just a requirement that my own work be attributed to me.
Don't know but your search for relivent will return zero results every time.
Article in December Wired talks about Amazon's book scanning, how they legally do it, who does it, how many books so far, and protections.
I don't see anything wrong with what the poster is doing. He or she used Amazon's system to identify books in which his or her work was not correctly attributed.
How is this an Abuse of the legal system???
I have no sig yet I must scream.
You could have robots trolling this section all day.
Uhuh. Security. Whose?
What's your point? You think Amazon is a dishonest porn site that takes your credit card information and disappears the next day?
Yeah, I want to be financially secure too !
If that's your mentality, how are you surfing the web?
What the fsck's your point man? What does Amazon demanding your credit card number for security have to do with you "wanting to be financially secure"? How did you even get modded up in the first place?
A few million people shop through Amazon. You think unauthorized purchases and fraudulent credit card transactions show up every month on their statements? Jeez, get a life dude.
Bush is on fire and its not good for my lungs.
A full text search of slashdot, so the editors can search for duplicate articles before they post.
Scott
Thus, your searches will tend to return more results from books that are fully indexed.
Now that I think about it - this is a major incentive for publishers to get their books indexed.
On a Metafilter discussion earlier today someone discovered that it maxes out at about 75 pages. They got this message:
You've reached the page-view limit for this book or you've reached the monthly page-view limit for the Search Inside the Book feature. Feel free to return to the pages you've previously viewed. If you want to see more of this copyrighted material, you can purchase this book. You can also search inside other books. Click here for more information or continue shopping.
So evidently they are keeping track and your quota resets every month. Interesting.
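For the curious, a monthly page-view cap like the one described could be tracked along these lines. This is purely a guess at the bookkeeping: the `PageViewQuota` class, its method names and the default of 75 pages (taken from the "about 75 pages" report above) are assumptions, and Amazon's actual mechanism is not described anywhere in this thread.

```python
from collections import defaultdict


class PageViewQuota:
    """Hypothetical per-user cap on distinct pages viewed per calendar month."""

    def __init__(self, monthly_limit=75):
        self.monthly_limit = monthly_limit
        # Maps (user, "YYYY-MM") to the set of (book, page) pairs already viewed.
        self.viewed = defaultdict(set)

    def request_page(self, user, book, page, month):
        """Return True if the user may view this page, recording the view."""
        seen = self.viewed[(user, month)]
        if (book, page) in seen:
            return True  # returning to a page viewed this month costs nothing
        if len(seen) >= self.monthly_limit:
            return False  # limit reached until the month rolls over
        seen.add((book, page))
        return True


quota = PageViewQuota(monthly_limit=2)
print(quota.request_page("alice", "Some Book", 10, "2003-10"))
print(quota.request_page("alice", "Some Book", 11, "2003-10"))
print(quota.request_page("alice", "Some Book", 12, "2003-10"))
```

The sketch keys the counter by month so the quota resets automatically, and it lets a reader go back to pages already counted, matching the "feel free to return to the pages you've previously viewed" wording. It simplifies by ignoring the separate per-book limit the message also mentions.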
...are included in the search?
A check on "the clocks were striking thirteen" yields seventeen hits, including the Cliff's Notes to Nineteen Eighty-Four and a reference in the Oxford Dictionary of Modern Quotations...
but none to Orwell's Nineteen Eighty-Four itself.
We must conclude that the coverage is spotty.
"How to Do Nothing," kids activities, back in print!
Yes, but searching pages scanned/OCR'ed and highlighting the keywords has been a feature of Google search for a long time:
Google Catalogs (Beta)
It's very probable that they licensed the Catalog Search technology from Google.
---
Support Mozilla. Buy the CD.
What a feat of computing genius! Using computers to search through large bodies of text!!!! Has ANYONE ever done this before?!
If you can read 5 pages of text per search, couldn't you just continually search for a phrase on the 5th page, allowing you to read any book for free with a decent amount of effort?
GL
Read all about it here: http://www.nettle.com/archives/000062.html
Or if a patent is already owned.
This type of editorializing is pathetic in that its only purpose is to stir up the masses. Gee...now let's take a look shall we? 20% of the comments are "patents suck" or "isn't this some example of prior art"?
This story is about a new feature people...it's not about a patent. Wipe the froth from your mouths and comment on the merits (or lack of) the feature...not on a completely fabricated hypothetical comment meant to incite you into a frenzy.
Well at least they don't refer to a liquid as 'gas' like the Americans do when talking about petrol.
Drill baby drill - on Mars
I was stuck when working on a problem set; I Googled for a while and found out that there's a bunch of helpful info in one particular problems and solutions book. Curious about the book, I went on Amazon, and lo and behold, I can actually read the book. So, I look at the table of contents, find the relevant section, and search for the heading of that section. I can now read two pages from it. Not a problem; just pick a phrase on the second page and use it as a search query. Lather, rinse, repeat.
That, of course, would be impractical to do for more than ~4 pages (which was what I needed), but you get the point.
In a couple of hours I joined a few other guys working on the set, and it turned out they had just bought the book. There was a big "Doh!" when I showed them my printouts.
Now, if I actually found the book genuinely useful as a result of this experience, I'd buy a hardcopy. But for now I think I'll stick with the current method. And I suspect many people might do just that: oftentimes there are references that aren't crucial to have, but convenient to turn to on a few occasions. The book search feature is perfect for those.