Google to Offer API
philipx writes "From the ruby-talk archives here's a little interesting snippet from a post you have to check out:
"Here at Google, we're about to start offering an API to our
search-engine, so that people can programmatically use Google through
a clean and clearly defined interface, rather than have to resort to
parsing HTML." It goes on talking about SOAP and I think this is utterly cool."
This is very cool, but how long will it last? How will Google make money (and, by extension, stay open) when you don't even have to visit their site?
Good idea. By the way, shouldn't /. have a specific "Google" topic?
The only problem I can see with this is that there was a recent thread on here about Google blocking a lump of IP addresses as someone in there was automatically querying way too often and affecting their load.
With the exposed API I could see, by malice or sheer accident, floods of queries coming in...
Text ads... Open standards for content distribution... If only certain other sites would follow...
ok then your [sic] infringing on my copyright! Could you as [sic] me next time before STEALING my comments for your own?
This is really fantastic. I can already think of a dozen scripts or so that I'd like to write to take advantage of this. I love the fact that this is from a Ruby list, and it's about Google. It's not MSDN and MSN.
They'll need a business model of some sort -- without the ads, and with the potential this has to hammer their servers, they'll need to meter access to the API in some way. But I'll pay -- where do I sign up?
I'll bet that this is how they'll end up making most of their money a couple of years from now.
Could this be in response to the supposed competition from tokohma? Open up their results in some way to increase their usage?
So how useful might that API be if you can't do anything with it...
Ok, it can be done already, but this would make it possibly too easy...?
Also, this will leave out the ads etc. that they get revenue from. I wonder what their long-term strategy is?
The problem with slashdot is that most of its users were bullied and stuffed into lockers as kids!
I just wonder how it will tie into my app. Will it open my browser? Will the Google Bar plugin be the foundation?
We'll just have to wait and see...
US Democracy:The best person for the job (among These pre-selected choices...)
Well?
I've been writing a bookmarking application that directs the user to Google and later remembers the last Google search so you can resume it. This API will simplify the interface significantly and open up a whole new world of possibilities.
Miko O'Sullivan
The first page I visit every morning
---
The following is the preliminary code that a particular Google sysadmin (ian@) is trying out. He'd prefer to have a single WSDL file do all of the configuration (from Google's end to the client), but he first needs to get some advice from an experienced Ruby hacker.
Also, let's keep in mind that this API will actually be decreasing Google pageviews and hits, which will in turn make their AdWords, AdWordsSelect, and textads less effective. So, it's our duty to continue to support Google and show them that the free/open source software people are behind them 100%. We know that Teoma just doesn't deliver, and Google's already got 3 billion pages indexed and cached.
Support Google today, because they're the future of information indexing on the Web!
--- begin code ---
#!/usr/bin/ruby
require 'soap/driver'
endpoint = 'http://api-ab.google.com/search/beta2'
ns = 'urn:GoogleSearch'
key = 'xxxxxxxxxxxxxxx'
service = 'file:GoogleSearch.wsdl'
query = ARGV.shift || 'foo'
soap = SOAP::Driver.new(nil, nil, ns, endpoint)
# uncomment the next line to dump the traffic on the wire
#soap.setWireDumpDev(STDERR)
soap.addMethodWithSOAPAction('doGoogleSearch', ns, 'key', 'q', 'start',
                             'maxResults', 'filter', 'restrict',
                             'safeSearch', 'lr', 'ie', 'oe')
r = soap.doGoogleSearch(key, query, 0, 10, false, nil, false, nil,
                        'latin1', 'latin1')
printf "Estimated number of results is %d.\n", r.estimatedTotalResultsCount
printf "Your query took %6f seconds.\n", r.searchTime
--- end code ---
I haven't tried to get it to work yet, due to not having Ruby installed, but does this imply some sort of subscription service?
Possibly a new way for them to raise revenue? I'm assuming that the bold line means the author's key has been blanked out so other people can't abuse this service for free?
Lameness filter encountered. Post aborted! Reason: Too much repetition. :/
The problem with slashdot is that most of its users were bullied and stuffed into lockers as kids!
Are they going to release the source code to the search engine itself? That would be REALLY cool...
We can finally find out how to implement their PigeonRank system...
Perhaps a farmer picking apart a haystack, one piece at a time.
dinner: it's what's for beer
You don't suppose they'll modify their terms of service to accommodate the new API do you?
Think, people.
Last year Google temporarily had an XML interface available using a query like: http://www.google.com/xml?q=slashdot
Of course, now it's just forbidden. I am surprised they would go back to such a service, it would seem to wind up losing revenue for them depending upon whether or not people are good about passing along whatever Ad-words Google returns. They could expect the traffic to be low enough to not matter compared to the continued word-of-mouth benefit. Or access to the SOAP interface could be offered as a subscription model (pure speculation on my part).
-Robert
Google has been an enchantment for me since its beginning !
:)
They have always made the right decision ! they have offered internet users an incredible asset ! and I was so much grateful when they decided to rescue Deja, a site I just don't know how I could live without !
I view them as the most "honest and fair" site on the Net ! and without any doubt the most useful too.
Go Google ! you are showing the right way ! to all these stupid-crapy-portal sites which have invaded the net, I just hope you manage to stay in business and prosper for a loooooong, looooong time
Using the API and a dictionary I could find the most google smacks.
Sean.OutaHere()
They could actually charge for a devkit or usage to break even on the project. Even if it did cost some money, I could see it being well worth the price, if it works well.
.NET Framework community website.
>I just wonder how it will tie into my app. Will it open my browser? Will the Google Bar plugin be the foundation?
The post describes a SOAP web service, which in most cases is an RPC call in your application of choice. However, unlike RPC in days of yore, using SOAP to do RPC in applications is relatively easy. If you want to learn more about SOAP, I suggest reading A Gentle Introduction to SOAP by Sam Ruby for an overview of the protocol and A Busy Developer's Guide to WSDL 1.1 to see how one could go from defining a WSDL file (as the Google sysadmin is trying to do) to actually accessing the web service remotely from a Java application.
There is also a grab bag of resources on XML webservices at the
To answer your question: if the Google API is available as a web service, then it can be integrated into any application at all, from command line to dynamic web page to GUI application, as long as there is network availability on the host machine.
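For anyone wondering what "an RPC call" over SOAP actually looks like on the wire, here is a rough sketch in Ruby that only builds the request body and sends nothing. The method name, namespace and parameter names come from the script posted in this thread; the envelope layout is generic SOAP 1.1 and the untyped elements are my assumption, not something checked against Google's WSDL. `build_envelope` and `xml_escape` are made-up helper names.

```ruby
# Parameter names and order as used by the Ruby script posted in this thread.
PARAMS = %w[key q start maxResults filter restrict safeSearch lr ie oe].freeze

# The five XML special characters and their entity replacements.
XML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
  '"' => '&quot;', "'" => '&apos;'
}.freeze

# Escape special characters so query text cannot break the envelope.
def xml_escape(value)
  value.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

# Build a generic SOAP 1.1 request body for doGoogleSearch.
# Assumption: plain untyped elements; a real service may insist on xsi:type attributes.
def build_envelope(args, ns = 'urn:GoogleSearch')
  raise ArgumentError, "expected #{PARAMS.size} arguments" unless args.size == PARAMS.size

  fields = PARAMS.zip(args).map do |name, value|
    "      <#{name}>#{xml_escape(value)}</#{name}>"
  end
  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
      <SOAP-ENV:Body>
        <ns1:doGoogleSearch xmlns:ns1="#{ns}">
    #{fields.join("\n")}
        </ns1:doGoogleSearch>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
  XML
end

if __FILE__ == $PROGRAM_NAME
  puts build_envelope(['xxxxxxxxxxxxxxx', 'foo', 0, 10, false, nil, false, nil, 'latin1', 'latin1'])
end
```

The point is that the "call" is just an XML document POSTed over HTTP, which is why it can be driven from a command-line tool, a web page or a GUI alike. A SOAP library does this marshalling for you.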
http://www.ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/37623
;)
SCAT.rb ???
it would have made the creation of my random google searcher a bit easier, and faster.
go get it
Text ads shmext ads. They can easily be ignored. The thing that will pay for this kind of access is actually Google's pay-per-placement plan. Advertisers will pay for their sites being ranked high, not for their banners being shown to us. Any application that uses this API to search Google will return those sponsored results, which is as good as a banner view. Actually, if it's targeted (only sponsored sites that are relevant to your search will be shown), both users and sponsors will be pleased.
all of us do nothing but rave about google day and night
for it is a search engine we love, with a company many of us have come to love
I for one would love to see google have its own slashdot icon
Come to think of it, there are plenty of USELESS icons none of us give a damn about
the following are a few:
Here's hoping for a new google icon!!
Just my two cents, all taxes included
Sunny Dubey
3. Do a shit
4. Eat a shit.
They'd better make sure a "clean and clearly defined interface" doesn't violate one of Overture's patents or they may have another suit on their hands...
Is this API going to have a system and method for enabling information providers using a computer network such as the Internet to influence a position for a search listing within a search result list generated by an Internet search engine? Because that is what Google is being sued for at the moment. Interesting that they choose now to release the API. Almost as if, by showing that the function is an integral part of a different system (by way of this new API), they have a chance in the courts. I'll let you be the judge!
It isn't a lie if you believe it.
Yes, it would be a very good thing if Google employees used soap. That's pretty disgusting if they don't.
Of course you're going to have clean code if you use a SOAP interface...
A hack and a half, but it actually works!
com.google.soap.search.GoogleSearchFault: Invalid authorization key: xxxxxxxxxxxxxxx
at com.google.soap.search.QueryLimits.lookUpAndLoadFromINSIfNeedBe(Query
...
Alas, looks like the rest of us won't be able to play with Google's beta SOAP service. Which makes quite a bit of sense - this would be a great way for Google to allow people to resell Google in a standardized way, be it from inside a program (scary, too easy to reverse engineer) or from some other web service (less scary :-). It doesn't make much sense for Google to say, "Hey, world, come and use our search services for free without our ads."
Google benefits from the monetary system in an obvious way. They also benefit from the barter system by vastly adding to the crunch power which hopefully improves their indexing/grading system. Unused clock cycles which would otherwise be wasted can now earn some value for the users and at the same time give Google the 'value' for providing their service.
So their 'open' system if presented in the form of barter could actually work for the advantage of both parties involved.
On the other hand, Google would obviously not want you to set up your own search site that passes queries to their engine, harvests the results, and presents them on your own site. That is the obvious target of the "Personal Use" restriction.
As for the "Automated Query" restriction -- well, what do you think they mean by "Automated"? Programmatic access to their engine? They couldn't prevent that even if they wanted to. "Automated" obviously means programs that issue hundreds of queries for data mining purpose. Example: crawling the Groups archives to harvest email addresses.
(This was a matter of some concern to me, when I noticed that the Google Usenet archives included all my company's private groups. I'd innocently used my real corporate email, thinking that the groups weren't accessible outside the company. But the spam volume is still very low. Their bot detection software must be quite good.)
Note that making a simple API available doesn't enable any new kind of access to the Google engine. A clever programmer can already parse the HTML results. The API just makes it easier -- and gives Google another product they can sell licenses for.
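To make the "parse the HTML" route concrete, here is a small Ruby sketch of the screen-scraping approach. The markup in `SAMPLE_HTML` is invented purely for illustration and is not Google's real result page; `scrape_results` is a hypothetical helper name.

```ruby
# Made-up markup for illustration only; NOT Google's real result HTML.
SAMPLE_HTML = <<~HTML
  <p class="g"><a href="http://example.com/one">First <b>hit</b></a></p>
  <p class="g"><a href="http://example.org/two">Second hit</a></p>
HTML

# Pull [url, title] pairs out of the page with a regex,
# then strip any inline tags left inside the titles.
def scrape_results(html)
  html.scan(%r{<p class="g"><a href="([^"]+)">(.*?)</a>}m).map do |url, title|
    [url, title.gsub(/<[^>]+>/, '')]
  end
end

if __FILE__ == $PROGRAM_NAME
  scrape_results(SAMPLE_HTML).each { |url, title| puts "#{title}: #{url}" }
end
```

The regex is welded to one particular page layout, so any redesign silently breaks it. That fragility is what a defined interface removes; it grants no access that wasn't already there.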
They keep adding groundbreaking features to their products and throwing them out as if it were no big deal. Don't they know they're supposed to beat the PR drum every time one of their engineers burps? Bunch of commies!
Oh, I'm convinced now... where do I sign up?
I run on the WOODY API. Advanced Penile Interface.
;) )
Bin Laden too, Afghan Pussies, Inc. (Woodyless of course
OK, I'll ask: what's #1 and #2 ?
They do an output without HTML already, but it looks like they've restricted access to it. Compare
http://www.google.com/search?hl=en&q=blah&output=protocol4
with
http://www.google.com/search?hl=en&q=blah&output=washingtonpost
with
http://www.google.com/search?hl=en&q=blah&output=unclesam
-nonymous
In a nutshell, Google wants every query to be triggered by a human typing some words in a text field, and wants the results to be used for what they are: search results.
But there is already a pretty good interface to do just that. You might want to somehow adapt the interface, but that's a pretty boring project. Anything else a programmer can think of is probably infringing. In other words, the API can't be used for anything imaginative. That ruins the fun for most people.
http://www.google.com/xml?q=slashdot
You'll (probably) get an error page.
I read about this on Scripting News in February:
Dave Winer made an inquiry to Google about accessing this XML API.
Their initial response was not very helpful, asking for the link to be removed, and saying that the link is "obviously reserved for Google partners." Eventually, Google let Dave access the API. Now, he sounds like he's under NDA about this.
for an interface to the MSDN KB that actually works. And by works, I mean, returns useful hits on queries. I almost always resort to searching the MSDN KB using Google, usually with quicker, more accurate results. Tell your employers that they should spend some time and money making their online support tools less shitty.
and more like an XML Web service. That means someone could fairly easily make a page that'll let you search google from your cellphone :)
FYI for all of the google lovers...
Google will refuse to do any business with anyone who trades in guns or knives - even if the advertisement you purchase is not related to guns or knives.
See http://www.bowmansbrigade.com/google1.htm
And as for "not having to visit their site," remember that they're not doing huge amounts of banner ads. It's not totally evident that this "destroys" any of their business.
They still get to collect statistics on what queries come from where on what, which doesn't change terribly much whether they're receiving queries as HTML FORMs or XML SCHEMAs, and there's only a little reason for them to care about folks receiving back HTML versus XML
If you're not part of the solution, you're part of the precipitate.
If you run the Ruby script, as is, the result is thus:
#: Exception from service object: Invalid authorization key: xxxxxxxxxxxxxxx (SOAP::FaultError)
If somebody starts abusing a particular key, it's a no-brainer for Google to shut the key off.
If you're not part of the solution, you're part of the precipitate.
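Shutting a key off, or metering it, could be about this simple on the server side. This is a hypothetical Ruby sketch: the `KeyMeter` class, the daily limit and the storage are all my invention, and nothing here reflects how Google actually implements it. Only the "Invalid authorization key" wording is borrowed from the error quoted above.

```ruby
# Hypothetical per-key metering; every name and limit here is an illustration.
class KeyMeter
  class Rejected < StandardError; end

  def initialize(daily_limit)
    @daily_limit = daily_limit
    @counts = Hash.new(0) # [key, day] => number of queries charged so far
    @revoked = {}         # key => true once the key has been shut off
  end

  # Shut a key off entirely, e.g. after abuse.
  def revoke(key)
    @revoked[key] = true
  end

  # Record one query for the key and return the running count for the day,
  # or raise Rejected if the key is revoked or already at its quota.
  def charge(key, day = Time.now.strftime('%Y-%m-%d'))
    raise Rejected, "Invalid authorization key: #{key}" if @revoked[key]
    raise Rejected, "Daily limit of #{@daily_limit} queries reached" if @counts[[key, day]] >= @daily_limit

    @counts[[key, day]] += 1
  end
end
```

Because every request carries the key, both the quota check and the kill switch are a single hash lookup. That is much cheaper than blocking lumps of IP addresses after the fact.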
Somebody is beginning to see what web services are really for.
No space before exclamation points, please, and use them wisely.
Here
Haven't tried it yet, so I can't say if it works.
-- Brian
The most rabid believers in American Exceptionalism are the exact same people whose policies are destroying it.
CPAN already contains the WWW::Search API to many search engines (including Google until [I am told] they requested it be removed). Yes, internally, it works by parsing HTML, but it exports a (Perl) API.
Frankly, I wouldn't be surprised if they didn't happily take money for better/more complete indexing, they just didn't admit it. "Do you?" "No." "Here's some money." "Oh, all right then, since you asked so nicely."
Rather than making the API something ya gotta pay for, couldn't they simply put it into the terms of service that the ads have to be shown in any software that uses the API? They could possibly offer different types of ads (text, pictures, etc.) so that you could even develop a text-based app to use it and still stay within the terms of service. Have a nice little "Report a program not following the terms of service" link on the main page, and have all those people who love Google help them out by reporting any programs they find that don't show the ads. Oh, and then also offer a pay-for service if they want so that the program doesn't have to show the ads.
Guys... you do all realize that Microsoft *invented* SOAP... so it MUST be evil.
We should all boycott Google, right?
Hello?
I think this idea will work well for their google search appliance, if not for the internet. Imagine SOAP enabling the local google search engine. Now you can add search capabilities from any of the enterprise application (no matter what o/s, language) by connecting to the search appliance. That would be wonderful!
So let me get this straight:
Google provides the ability to bounce searches off of their search application over the Internet using an API.
Microsoft writes ".NET" to allow folks to utilize application services over the Internet using an API.
They seem pretty similar in concept. Perhaps I'm missing something here, but isn't what Google is doing essentially the same thing as what a .Net service would allow them to provide?
Shoot, I wonder if they won't just set it up as a .Net service once .Net is available for Linux (wouldn't this allow them to do authentication and billing?)...
Slash either needs to get a Google box or use these APIs to fix their search feature. There is so much haystack data compared to good needles on Slashdot and the search is so bad that most of the great gems of knowledge that Slashdot has generated might as well have never existed. It can take an hour to find even a popular poster's comments.
Need to reference John Carmack's comments? Sorting him out of the masses is next to impossible. Even a comment poster as prolific as Signal 11 (arguably Slashdot's first and greatest Karma Whore) is nearly impossible to find. First 30 matches of how many? You want to sort through jeffy124's 700+ comments and 24 submitted stories by hand just to find the pertinent one I need? Not to mention the benefit to Slashdot's editors, being able to follow a clear history of articles on a given subject to look for repeats and make more informed editorial commentary. If 90% of readers never read the comments, the editors owe that 90% the sort of editorial commentary attached to each story that only good research can provide.
In fact, the editors could try it on an interim basis immediately, and provide the service to readers only if they had the resources. I sort of get the feeling that the editors are still thinking of slashdot as a small time blog run out of their apartment closet server.
Run Google on Slashdot now and you get the news from three weeks ago. Incorporate a Google box or Google APIs into Slash so I could search today's news and I would pay 10 cents of subscription funds per search in a heartbeat.
Editors: look at the number of hits to your current broken search engine. Double that number because a dedicated google box would be so much better it would get used a whole lot more. Multiply that by 10 cents per search. See if the numbers work to afford the initial expenditure to get a nice yellow rack mount google box. Slashdot is sitting on a goldmine of data and no one can search it and Slashdot cannot profit from it without a nice pay per search subscription using the best engine available.
If voting were effective, it would be illegal by now.
Google does not have a pay for placement plan - if you are making reference to the practice of changing the order of search results based on advertiser dollars.
Yes it does. When you search Google, it displays two distinct sets of results side-by-side. One set is based solely on PageRank values; the other (clearly marked "Sponsored Links") is based on advertising dollars. The problem with GoTo was that you had to scroll and click past pages and pages of sponsored links to get to the results scrolled by relevance.
Will I retire or break 10K?
They admit it: see Google Hosted SiteSearch. Pay Google the $$$, and Google will index all your site's pages as often as you want.
It's primarily intended for webmasters of commercial sites who want to outsource their "Search This Site" functionality, but the index does leak into the general web search database.
Will I retire or break 10K?
I was using Googolplex to do it. Now I will migrate!
- Cinema times: write your own program to integrate with your personal organiser to find out when you're free to go to the cinema. Integrate with a movie reviews programmatic interface to see if you'll like the film.
- Airport arrivals and departures: write your own program to keep track of your partner's flight, and get it to send you a text message to your phone when it arrives (through another programmatic interface).
- Real content distribution and integration. None of this IFRAME or remote JavaScript trash. Imagine being able to license content and use SOAP-based RMI to integrate it into your site.
I really love the idea of programmatic interfaces. It's fascinating, because the problem we have now is not a lack of information, but a lack of coherent *integration* of this information. I also think that once people realise the limitless applications and emergent properties that arise when different sources of information can be easily integrated, a whole new generation of people will be introduced to programming, especially with easy interfaces to SOAP through Perl and the SOAP::Lite module.
Google already blocks PHP from querying. Try
wget http://www.google.com/search?q=moo --user-agent=PHP
as opposed to
wget http://www.google.com/search?q=moo --user-agent=Mozilla
It stops fopen and file from working; you can either modify the source or use fsockopen, though. If they don't want PHP scripts parsing Google, why would they want this? :)
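The same wget comparison can be sketched in Ruby with the standard net/http library. `search_request` and `fetch` are made-up helper names. The code only shows how the User-Agent header is set; whether a given agent string is actually refused is the observation above, and this sketch does not verify it.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a GET request for the search URL used above,
# with an explicit User-Agent header.
def search_request(query, user_agent)
  uri = URI('http://www.google.com/search')
  uri.query = URI.encode_www_form(q: query)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = user_agent
  request
end

# Send a prepared request and return the response object.
# Not called here, so loading this file makes no network traffic.
def fetch(request)
  Net::HTTP.start(request.uri.host, request.uri.port) do |http|
    http.request(request)
  end
end
```

To repeat the experiment, compare `fetch(search_request('moo', 'PHP')).code` with `fetch(search_request('moo', 'Mozilla')).code`. The two requests are identical apart from one header, which is all the server has to go on.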
Personally, I tend to think that the reason that Google would launch something like this is to support their new 'Overture-style' auctioned keywords... You'll probably get no-cost search results, as long as someone is paying them to rig the first couple.
Redundant?! *Please* answer and tell me how the hell was it redundant while 95 minutes later a comment duplicating a subset of mine was suddenly interesting. Oh, I get it, you mean redundant in a sense, that others said the same later? I see, it is redundant for some reversed definition of redundancy.
I don't care about the stupid karma, but I do care about people reading and replying to my comments. Now I have karma 50 again, thanks to my other post, so you can safely mod my post up as +2 informative, +1 insightful and +1 interesting, not being afraid that I might get some karma. Thank you for your attention. I'm sure no one will read it now with Score: 1. Of course I'm currently not eligible to Meta Moderate so I can't even complain.
~shiny
WILL HACK FOR $$$
a more clean method for people who want to use google (outside of the normal web interface) means less to transfer from the server, which means less $$$ spent on bandwidth bills.
it a common sensicle
the only fact is that everything is an opinion
People can already access Google through an API (Perl modules exist already, for instance) by making an API that parses the HTML. There is thus no reason for them to *not* make things available. It costs less to transmit the results without formatting information, it's useful to people, .... Don't forget, Google is pretty focused on usability, which is one of the main things that has made them so popular IMO.
SSL Certificate
Ever heard of it?
Think of the Multiplayer Online games with access to this kind of database for content and using it to port parts of it to a game universe... Better extend those schizo wards now , cause it's gonna be rough on some people :)
> /.CONTRADICTION:
Dissing the rule of Satan, only to advocate the
rule of a loving God.
reductio ad absurdum
-I like my women like I like my tea: green-
>Dissing the rule of Satan, only to advocate the rule of a loving God.
You must be a Sun employee to call Scott McNealy's brainchild the reign of a loving God.
:)
makes no sense.
is my min char limit reached yet?
Google doesn't seem to have been updated for about 3 months.
e.g. a search for "GXP120" on Google gets me 29 results.
The same search at www.alltheweb.com gets me 1012 results.
scat..... the act of deriving sexual pleasure from human shit.
i wasnt commenting on the lang they used. prat
I was just thinking about incorporating a Google search on my websites after an impressive experience with a few websites that employed their Free WebSearch plus SiteSearch feature.
This is even better. With this feature, I'll be able to SSI and/or push results using something as simple as SoapLite to get the job done.
I sure hope other content providers are taking note. Imagine how useful (not to mention fun) it would be to snap up stuff from places like MoreOver.Com?
healyourchurchwebsite.com - WWJB?
Check out Jakarta ORO, a Perl 5 Compliant Regular Expression Library for Java at http://jakarta.apache.org/oro/
I've been using this on my Java projects for years, works great! (Of course you have to know how to write Perl 5 regexes, an art in itself!)
Also, Java 1.4 is FINALLY bundling some sort of regex library in its core API, but I haven't messed with it yet.
Actually, I'm a disgruntled ex-employee.
I think of Scott as more like Jim Barksdale
on acid than God.
-I like my women like I like my tea: green-
Of course you may get any CLR port on whatever OS you want, but you will never get a real .NET-compliant OS on anything but a WinXX OS !
;-)
:o)
MS gents are maybe idiots in communication, but they are not stupid in strategy
This is not Java babe
-4R34'.
My favorite "feature" on the full text search implementation in SQL Server 7.0 was that you couldn't pass variables into the searching functions. You had to pass in string literals.
Thus, you either had to allow selects on the table/view and run them from the client app, or allow selects on the table/view and write a nasty ass stored proc to dynamically create SQL and execute that.
They want to be taken as a serious enterprise player, as far as databases go, but they insist on introducing half-baked features at every iteration.