Slashdot Mirror


Google Caffeine Drops MapReduce, Adds "Colossus"

An anonymous reader writes "With its new Caffeine search indexing system, Google has moved away from its MapReduce distributed number crunching platform in favor of a setup that mirrors database programming. The index is stored in Google's BigTable distributed database, and Caffeine allows for incremental changes to the database itself. The system also uses an update to the Google File System codenamed 'Colossus.'"

3 of 65 comments

  1. Sounds inefficient by martin-boundary · Score: 4, Interesting

    This sounds like it's going to be highly inefficient for nonlocal calculations, or am I missing something? Basically, if the calculation at some database entry is going to require inputs from arbitrarily many other database entries which could reside anywhere in the database, then the computation cost per entry will be huge compared to a batch system.
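
    To put rough numbers on that worry, here is a toy sketch. Everything in it is a made-up assumption for illustration (the fully connected dependency graph, the naive "recompute every dependent from scratch" rule, the function names); it says nothing about how Caffeine actually handles such calculations. It only counts how many entry reads a single change triggers compared with one full batch pass.

```python
N = 200

# Hypothetical worst case: every entry takes input from every other entry.
inputs = {e: [i for i in range(N) if i != e] for e in range(N)}


def reads_for_batch(inputs):
    """One full batch pass: each entry reads each of its inputs once."""
    return sum(len(srcs) for srcs in inputs.values())


def reads_for_single_update(inputs, changed):
    """Naive incremental rule: every entry that depends on `changed`
    is recomputed from scratch, re-reading all of its inputs."""
    return sum(len(srcs) for srcs in inputs.values() if changed in srcs)


batch = reads_for_batch(inputs)
single = reads_for_single_update(inputs, changed=0)
print(f"full batch pass: {batch} reads; one naive update: {single} reads")
```

    In this deliberately extreme graph, one naive update costs almost as many reads as rebuilding everything. Sparser dependencies, or propagating only the change rather than recomputing from scratch, would shrink that gap considerably.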

  2. Re:I have no idea by icebike · Score: 4, Interesting

    Follow the link to the original article over on The Register, where you will find a rather lucid explanation, far better than the summary above can provide.

    Short answer:

    The old method of building their search database was essentially a batch job: run it, wait, wait, wait a long time, then swap the results into the production servers.

    The new method is continuous updates into a gigantic database spread over their entire network.

    This is why things show up in Google days, sometimes weeks ahead of the other search engines. The other guys are still trying to clone Google's old method.
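
    The batch-versus-continuous contrast can be sketched with a toy inverted index. This is only an illustration under loud assumptions: the whitespace tokenizer, the in-memory dictionaries, and the names `batch_build` and `incremental_update` are all hypothetical, and none of it reflects Google's actual code or data layout.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately crude assumption)."""
    return set(text.lower().split())


def batch_build(docs):
    """Batch style: rebuild the whole inverted index from every document."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def incremental_update(index, docs, doc_id, new_text):
    """Incremental style: touch only the postings affected by one document."""
    old_terms = tokenize(docs.get(doc_id, ""))
    new_terms = tokenize(new_text)
    for term in old_terms - new_terms:
        index[term].discard(doc_id)
        if not index[term]:
            del index[term]  # drop empty posting lists
    for term in new_terms - old_terms:
        index[term].add(doc_id)
    docs[doc_id] = new_text


docs = {"a": "google caffeine index", "b": "mapreduce batch job"}
index = batch_build(docs)

# One document changes: update in place instead of rebuilding everything.
incremental_update(index, docs, "b", "caffeine incremental update")

# The incrementally maintained index matches a fresh full rebuild.
assert index == batch_build(docs)
```

    The incremental path does work proportional to the one changed document, while the batch path rereads every document before anything new becomes searchable.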

    --
    Sig Battery depleted. Reverting to safe mode.
  3. Re:I have no idea by A Friendly Troll · Score: 4, Interesting

    This is why things show up in Google days, sometimes weeks ahead of the other search engines.

    For a hands-on example of what icebike is saying, look here:

    http://www.google.com/search?q=%22This+is+why+things+show+up+in+Google+days%2C+sometimes+weeks+ahead+of+the+other+search+engines%22

    Actually, Google will index Slashdot comments in a matter of minutes.