Slashdot Mirror


Google Indexing In Near-Realtime

krou writes "ReadWriteWeb is covering Google's embrace of a system that would enable any Web publisher to 'automatically submit new content to Google for indexing within seconds of that content being published.' Google's Brett Slatkin is lead developer of PuSH, or PubSubHubbub, a real-time syndication protocol based on Atom, where 'a publisher tells the world about a Hub that it will notify every time new content is published.' Subscribers then wait for the hub to notify them of the new content. Says RWW: 'If Google can implement an Indexing by PuSH program, it would ask every website to implement the technology and declare which Hub they push to at the top of each document, just like they declare where the RSS feeds they publish can be found. Then Google would subscribe to those PuSH feeds to discover new content when it's published. PuSH wouldn't likely replace crawling, in fact a crawl would be needed to discover PuSH feeds to subscribe to, but the real-time format would be used to augment Google's existing index.' PuSH is an open protocol, and Slatkin says that 'I am being told by my engineering bosses to openly promote this open approach even to our competitors.'"
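
For readers who want to see roughly what "declaring a Hub" and "notifying a Hub" look like, here is a minimal Python sketch. It assumes the feed advertises its hub with an Atom link element using rel="hub", and that the publisher ping is a form-encoded POST body carrying hub.mode=publish and hub.url, as the PubSubHubbub drafts describe. The feed, example.com, hub.example.com, and the function names are all made up for illustration, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed: the hosts below are placeholders, not real services.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <link rel="hub" href="https://hub.example.com/"/>
  <link rel="self" href="https://example.com/feed.atom"/>
</feed>"""


def discover_links(feed_xml):
    """Return (hub_url, topic_url) as declared by the feed's link elements."""
    root = ET.fromstring(feed_xml)
    links = {}
    for link in root.findall(ATOM_NS + "link"):
        links[link.get("rel")] = link.get("href")
    # rel="hub" names the hub; rel="self" names the feed (the "topic").
    return links.get("hub"), links.get("self")


def publish_ping_body(topic_url):
    """Build the form body a publisher would POST to its hub after updating.

    Assumption: field names follow the PubSubHubbub drafts
    (hub.mode=publish, hub.url=<topic URL>).
    """
    return urlencode({"hub.mode": "publish", "hub.url": topic_url})


hub, topic = discover_links(SAMPLE_FEED)
body = publish_ping_body(topic)
print("POST to:", hub)
print("body:   ", body)
```

A subscriber such as a search engine would do the same discovery step after a normal crawl, then register with the hub instead of polling the feed, which is why crawling is still needed to find the feeds in the first place.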

6 of 79 comments

  1. Maybe I'm just a noob, but... by Pojut · · Score: 3, Interesting

    ...someone help me out here. People can still find my articles through Google before I see Googlebot hit any new articles I post... how is that possible? How would my pages show up on Google before the bot actually crawls them?

    1. Re:Maybe I'm just a noob, but... by garcia · · Score: 3, Interesting

      My site is by no means high-traffic, but Googlebot indexes my pages (and shows them in search results) within three minutes:

      crawl-66-249-65-232.googlebot.com - - [04/Mar/2010:10:33:34 -0600] "GET /current-crime-decline-to-cause-public-safety-cuts HTTP/1.1" 200 47330 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

      I really don't see a need for something to be any more "real time" than that for someone's blog. Do you?

    2. Re:Maybe I'm just a noob, but... by K. S. Kyosuke · · Score: 4, Funny

      I have just found your test comment using Google.

      --
      Ezekiel 23:20
  2. kinda done now by hey · · Score: 4, Informative

    If Google notices that your site or blog updates frequently, the bot will come around more often, especially if it's a high-PageRank site.

  3. zen saying: by circletimessquare · · Score: 3, Funny

    "If a tree falls in the forest and no one is around to hear it, does it make a noise?"

    internet era update:

    "If a webpage is published on the web and no google spider notices it, does it exist?"

    near future update:

    "If a thought enters your mind that is not already indexed by google, is it real?"

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
  4. I just noticed it yesterday. by 140Mandak262Jamuna · · Score: 3, Interesting

    Funny, I just posted this yesterday on Panda's Thumb:

    As usual I tried to make a tongue in cheek remark and ended up chewing my tongue. I meant Google’s indexer is so fast. Original posting was made at March 3, 2010 2:09 PM. It was in the index by March 3, 2010 5:08 PM. And it was not even from news.google.com, it is the general web search. Pretty soon Google will tell me that I’m out of milk even before I open the fridge door.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact