Slashdot Mirror


Google Crawls The Deep Web

mikkl666 writes "In their official blog, Google announces that they are experimenting with technologies to index the Deep Web, i.e. the sites hidden behind forms, in order to be 'the gateway to large volumes of data beyond the normal scope of search engines'. For that purpose, the engine tries to automatically get past the forms: 'For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML'. Nevertheless, directives like 'nofollow' and 'noindex' are still respected, so sites can still be excluded from this type of search."
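
As a rough illustration of the select-menu part of that description, here is a minimal Python sketch that reads a form's option values out of the HTML and builds the GET URLs a crawler could try. It is not Google's code: the names FormOptionParser and candidate_urls are made up for this example, it only looks at GET forms and select menus, and it ignores text boxes, check boxes, radio buttons, and the nofollow/noindex checks mentioned above.

```python
from html.parser import HTMLParser
from itertools import product
from urllib.parse import urlencode, urljoin


class FormOptionParser(HTMLParser):
    """Hypothetical helper: collect the action and <select> values of GET forms."""

    def __init__(self):
        super().__init__()
        self.action = None        # action of the most recent GET form seen
        self.fields = {}          # select name -> list of option values
        self._in_get_form = False
        self._select = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # HTML forms default to GET when no method is given.
            method = (attrs.get("method") or "get").lower()
            self._in_get_form = method == "get"
            if self._in_get_form:
                self.action = attrs.get("action") or ""
        elif tag == "select" and self._in_get_form and attrs.get("name"):
            self._select = attrs["name"]
            self.fields[self._select] = []
        elif tag == "option" and self._select and attrs.get("value"):
            self.fields[self._select].append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None
        elif tag == "form":
            self._in_get_form = False


def candidate_urls(page_url, html):
    """Return one GET URL per combination of select-menu values."""
    parser = FormOptionParser()
    parser.feed(html)
    if parser.action is None or not parser.fields:
        return []
    names = sorted(parser.fields)
    base = urljoin(page_url, parser.action)
    return [
        base + "?" + urlencode(list(zip(names, combo)))
        for combo in product(*(parser.fields[n] for n in names))
    ]


if __name__ == "__main__":
    page = (
        '<form action="/find">'
        '<select name="make"><option value="ford">Ford</option>'
        '<option value="audi">Audi</option></select>'
        "</form>"
    )
    for url in candidate_urls("http://example.com/cars/", page):
        print(url)
```

The number of URLs grows as the product of the option counts, so a real crawler would presumably have to sample combinations rather than try them all.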

2 of 197 comments

  1. Re:Anecdote from Google by Colonel Korn · Score: 0, Troll

    The world needs web hosts that block all Google IPs.

    --
    "I zero-index my hamsters" - Willtor (147206)
  2. Re:In other news, by Rogan's Heroes · Score: 0, Troll

    Well, if you're a stupid enough developer that someone can hack your site purely by using GET requests, then you probably deserved it.