Slashdot Mirror


The Future of Google Search and Natural Language Queries

eldavojohn writes "You might know the name Peter Norvig from the classic big green book, 'AI: A Modern Approach.' He's been working for Google since 2001 as Director of Search Quality. An interview with Norvig at MIT's Technology Review has a few interesting insights into the 'search mindset' at the company. It's kind of surprising that he claims they have no intent to allow natural questions. Instead he posits, 'We think what's important about natural language is the mapping of words onto the concepts that users are looking for. But we don't think it's a big advance to be able to type something as a question as opposed to keywords ... understanding how words go together is important ... That's a natural-language aspect that we're focusing on. Most of what we do is at the word and phrase level; we're not concentrating on the sentence.'"
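
As a rough illustration of the word-and-phrase-level view Norvig describes, the sketch below reduces a question and its keyword form to the same set of terms. The stop-word list and the `key_terms` helper are assumptions invented for this example; nothing here reflects how Google actually processes queries.

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"what", "is", "the", "of", "a", "an", "does", "did", "why"}

def key_terms(query):
    """Lowercase the query, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return {w for w in words if w not in STOP_WORDS}

question = key_terms("What is the capital of France?")
keywords = key_terms("capital of France")
print(question == keywords)
```

Under these assumptions the question form adds nothing to the keyword form, which is the point the quote makes about typing "something as a question as opposed to keywords."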

15 of 148 comments

  1. natural language is an oxymoron by yagu · · Score: 3, Insightful

    I tend to agree with Norvig's focus on keywords and less emphasis on natural language. Trying to even define a natural language on top of a query engine introduces a layer of complexity that is probably unnecessary. Natural language even introduces noise that interferes with defining, as accurately as possible, what the user is asking for.

    Google has done a good job, and they get better each iteration figuring out what the user is looking for. I find their suggestion an effective way to not only constrain a query, it actually provides a way to spell check in a pre-emptive way. If you've not used this, install the Firefox Google toolbar, or use the experimental Google "Suggest". Often Google will provide suggestions in the drop down menu that refine your search in ways you hadn't considered that drive to a more direct and accurate representation of your intended query. Of course if their suggestions don't satisfy, you get to continue typing your keywords to your heart's desire.

    (I have to offer an example of suggestion's effectiveness. I often Google to get to the Chicago Tribune (I don't visit there often enough to have created a bookmark, plus it's easy to do this in anyone's browser). Simply typing the first four letters, "chic", I see the first suggestion is "Chicago Tribune". A simple TAB and RETURN, and I'm on the Google results page, where the first link or so is my link to the Tribune (with the added bonus of Google's breakout of sublinks).) Your mileage may vary (Google's ranking system may vary the order and options that appear in the drop-down over time), but I find it an amazingly effective research tool (suggestion, not the Trib).
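
    A minimal sketch of how prefix suggestions like this could work, assuming a log of past queries with popularity counts. The `QUERY_COUNTS` table, its numbers, and the `suggest` helper are all hypothetical; this is not a description of Google's actual system.

    ```python
    # Hypothetical query log: past queries mapped to made-up popularity counts.
    QUERY_COUNTS = {
        "chicago tribune": 900,
        "chicago dog boarding": 40,
        "chicken recipes": 300,
        "chicago weather": 650,
    }

    def suggest(prefix, limit=3):
        """Return the most popular logged queries that start with the prefix."""
        prefix = prefix.lower()
        matches = [q for q in QUERY_COUNTS if q.startswith(prefix)]
        # Most popular first; ties broken alphabetically.
        matches.sort(key=lambda q: (-QUERY_COUNTS[q], q))
        return matches[:limit]

    print(suggest("chic"))
    ```

    With these invented counts, "chic" puts "chicago tribune" at the top of the list, and typing more letters narrows the candidates further.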

    Natural language is mostly trying to guess intent with structure and key words (as opposed to keywords), but at the end of the day, if you filter out the natural language, and focus on keywords you're going to end up in close to the same place.

    1. Re:natural language is an oxymoron by porcupine8 · · Score: 4, Interesting

      I would find the drop-down suggestions a lot more useful if I could read more than the first two words. As I type in, for example "Chicago dog boarding" all I see is a list of "Chicago do... " I'm sure there must be a way to make the search space take up more of the toolbar (I don't really need that much room in the URL space, since most URLs that long are nonsense), but I don't know how and I don't really want my browser window to be the width of my screen.

      --
      Warning: Apple/Nintendo fangirl. Likes her electronics cute & cuddly. May be rabid.
    2. Re:natural language is an oxymoron by pluther · · Score: 4, Informative

      "Why did World War I start" or "what does a duck eat" are questions that require too much understanding and explanation of the concepts.

      Not at all. I do that kind of question in Google all the time.

      Googling for "Why did World War I start" brings up, as the first result, an article titled "The Causes of World War I".

      Followed by a few million more hits if that one isn't good enough.

      And the question "What does a duck eat" gets many hits as well. The first one has, in the summary:

      Ducks in the wild eat a variety of plants, insects, and native foods that will differ from...

      I know it's just picking out keywords from the query and matching them to the sites, not trying to parse the natural language, but it works pretty damn well.
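
      A toy version of that keyword matching, assuming a tiny index of made-up page summaries. The `PAGES` data and the `rank` helper are invented for illustration, and the scoring is a plain word-overlap count with no stemming or weighting, not anything Google is known to use.

      ```python
      import re

      # Made-up page summaries standing in for a crawled index.
      PAGES = {
          "causes-of-ww1": "The causes of World War I and how the war did start",
          "duck-diet": "Each wild duck will eat a variety of plants and insects",
          "paris-guide": "Paris is the capital of France",
      }

      def words(text):
          """Split text into a set of lowercase words."""
          return set(re.findall(r"[a-z0-9]+", text.lower()))

      def rank(query):
          """Order pages by how many query words they contain; drop pages with none."""
          q = words(query)
          scored = [(len(q & words(body)), name) for name, body in PAGES.items()]
          scored.sort(key=lambda s: (-s[0], s[1]))
          return [name for score, name in scored if score > 0]

      print(rank("Why did World War I start"))
      print(rank("What does a duck eat"))
      ```

      Nothing in it parses the question; the full sentence works only because its content words also appear on the page.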

      --
      If the masses can keep you down, you're not the Ubermensch.
    3. Re:natural language is an oxymoron by 0100010001010011 · · Score: 3, Funny

      Fine, those were easy. Let's see Google understand this one: Women.

  2. Google and Asimov's fictional Multivac by dpbsmith · · Score: 3, Interesting

    Isaac Asimov's fictional Multivac was a huge computer with some near-universal knowledge database that answered natural-language questions, giving Asimov all sorts of opportunities to present philosophical conundrums as entertaining short stories.

    In the 1960s and thereabouts, when I used to hack around on minicomputers, but personal computers weren't well known to the general public, I always found it difficult to explain what computers did. One of their commonest questions was "Well, how does it work, do you type in questions and does it answer them?" Programming in assembly language didn't really fit that description.

    Many technological fantasies seem to remain surprisingly distant. I tried ViaVoice and gave up: it's not a "voice typewriter." Roomba is not a general-purpose housekeeping humanoid-form robot, and neither are the machines that weld automobile chassis.

    However, it seems to me that Google is within striking distance of Asimov's "Multivac" fantasy.

    Incidentally, if you type in queries as complete sentences, Google doesn't seem to do any worse than if you don't. Sort of the converse of adventure games, where one begins by typing "Walk over to the table on the left and pick up the silver key with your left hand" and quickly learns to use telegraphic style: "Go table. Take key."

  3. Re:phrase/sentence? by harmonica · · Score: 4, Informative

    A phrase is part of a sentence. WP

  4. this is also why by circletimessquare · · Score: 5, Insightful

    text-to-speech or speech-to-text is also useless (unless you're blind/ deaf/ driving a car)

    the idea of interacting with a computer like a human is an artificial hangover from being introduced to the computer the first time. after using it for a while, you realize that interacting with a computer, in small limited ways, like searching information, is easier NOT using natural language

    for the very simple reason that it takes more thought, and more typing to interact naturally. it is easier to train a human to interact with a computer than it is to train a computer to interact with a human. and for the human, it is more rewarding, because the human realizes he doesn't need to exert so much effort

    "what is the capital of france?"

    versus

    "france capital"

    if you were to shout "france capital" at someone, it would be rude and confusing. but for a computer, it's actually superior

    it is the conservation of communication effort at work here that wins out over natural language in computer interaction

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
  5. Real questions ... by foobsr · · Score: 4, Interesting

    Typing "What is the capital of France?" won't get you better results than typing "capital of France." ... Most of what we do is at the word and phrase level; we're not concentrating on the sentence. We think it's important to get the right results rather than change the interface.

    This misses situations like searching for "That sf-short-story where the crew of the visiting spaceship is given a dog as a present" in which googling failed, at least for me, or, more technically, when you have absolutely no idea about what the relevant terms within the outcome might be. In short, if you have a real question.

    CC.

    --
    TaijiQuan (Huang, 5 loosenings)
  6. The capital of France by CruddyBuddy · · Score: 5, Funny

    Paris Hilton says:

    "That's easy! The capital of France is 'F'."

    --
    ----------
    Any problem can be made unsolvable if there are enough meetings made to discuss it.
  7. He's lying by helicologic · · Score: 4, Insightful

    I think Norvig's lying. Google may not be pursuing linguistic structure above the phrase level in searches, but I'd bet a donut they're working their asses off trying to analyze crawled docs linguistically. To get relevance, they need to extract what a document is about. That implies sentence-level syntax analysis, which is input to sentence-level semantics, which is input to paragraph-level semantics, which is input to "pragmatic" analysis. I think what he's not saying is that the place the linguistic research dollars are going is elsewhere than parsing "Where is Paris?"

  8. Re:What's really the story by theStorminMormon · · Score: 3, Insightful

    I think that actually misses the point. If you've worked as an engineer or a consultant - or even if you've just helped people search for stuff on Google - you probably have realized that THEY DON'T KNOW WHAT TO ASK FOR. A really good consultant/engineer is someone who has the ability to figure out what a person wants based on what they say.

    Even if you mastered natural language (and I'm not saying that's a surmountable task) I think people would be shocked to see that Google searches would still be frustrating.

    I'm not just saying "blame the user", I'm saying that language itself is not even the last obstacle to overcome. You're going to need to figure out a program that not only understands natural language, but also context, culture, etc.

    Getting an AI of near-human intelligence is not enough, because to be really good at getting people the answers to questions they can't ask you have to be of above-average capability.

    --
    The Southern Baptist Convention has creationism. On Slashdot, we have porn.
  9. Re:What's really the story by Alt_Cognito · · Score: 3, Interesting

    Bah, the engine just has to ask refinement questions. Of course, this could be interesting:

    User: Who is the winningest coach in football?
    Search Engine: Did you mean, What coach has the most wins in football?
    User: Yes
    Search Engine: Did you mean American football?
    User: Yes
    Search Engine: NFL NCAA CFL...?
    User: Umn, all of the above
    Search Engine: Are you sure?
    User: What?
    Search Engine: Are you sure you want to compare all years, after all, NFL rules significantly changed in 2001, and leagues are not comparable...
    User: Yes.. Yes, please compare them all....
    Search Engine: You know winningest isn't a word right? .... And so on and so forth...

  10. Re:The problem with natural language searches... by vertinox · · Score: 3, Insightful

    Most linguists currently believe in the existence of something called "universal grammar", which is a set of properties common to all acquirable human languages (that is, languages which can be learned as a native language).

    The argument against universal grammar is, of course, non-Latin languages like Japanese (and possibly Russian), which don't play by the rules. I'm not really a language expert on either, but I've tried to learn Japanese and it's really tough.

    Everything is relationship based off the speaker and to the person or object he is talking about and then the audience. As in... If I'm talking about a pencil sitting on my desk, it has a different tense than a pencil on your desk and then a different tense in someone else's hand or a pencil that is sitting at a far off place (-sara or -kara? I can't remember). And we haven't even gotten to issues about ownership like if it was in my hand or your hand.

    Whereas in Latin based languages it is more concerned about action or tense of ownership but not relationship to the speaker or audience. Hence... It is argued universal grammar does not apply in that respect.

    --
    "I am the king of the Romans, and am superior to rules of grammar!"
    -Sigismund, Holy Roman Emperor (1368-1437)
  11. Re:What's really the story by Anonymous Coward · · Score: 3, Funny

    User: NAKED WOMEN!
    Search Engine: Would you prefer wo
    User: NOW!!!
    Search Engine: *sigh* As you wish...

  12. What could possibly be wrong with that? by Dan+East · · Score: 3, Insightful

    > wii
    Your query does not include a verb.

    > find wii
    Whose "wii" do you want me to find?

    > find wii review
    Unable to find any reviews authored by "wii".

    > find review about wii
    No reviews found concerning the common noun "wii".

    > find review about Wii
    Here is the most recent review about the proper noun "Wii": [url to a page full of keywords related to Wii]

    > find review about Wii order by relevence
    "relevence" is not an English word. Did you mean "relevance"?

    > find review about Wii order by relevance
    Here is the most relevant review about Wii: [url to a 2 year old pre-review of the Wii before it was launched]

    > find review about Wii order by relevance then date
    Here is the most recent and most relevant review about Wii: [url to a fanboy site]

    > find all reviews about Wii order by relevance then date
    Working...

    > abort
    Abort what?

    > abort search
    I am currently performing 1,231,415 searches. Which search do you want me to abort?

    > abort last search
    You do not have permission to abort others' searches.

    > abort my last search
    Last search aborted.

    > find several reviews about Wii order by relevance then date
    "Several" is not a quantifiable adjective. Do you mean "seven"?

    > find seven reviews about Wii order by relevance then date
    Here are your results. For better search results please capitalize the first word of sentences, and end sentences with proper punctuation.

    Dan East

    --
    Better known as 318230.