Slashdot Mirror


Building a Bigger Search Engine

skreuzer writes "Wired is running a story about a distributed web crawler called Grub. People who choose to download and run the client will assist in building the Web's largest, most accurate database of URLs. This database will be used to improve existing search engines' results by increasing the frequency at which sites are crawled and indexed. Conceivably, Grub's distributed network could enable state information to be gathered on every document on the Internet, each and every day."

7 of 278 comments

  1. Biiig questions to answer by andy@petdance.com · · Score: 5, Interesting
    So Grub goes out, uses bandwidth, and then returns some results to the home base. It's really distributed bandwidth more than distributed computation.

    I bet one big reason Folding and distributed.net have succeeded is that many people run the clients on work boxes, knowing that there's little actual overhead incurred to their work. How different that is for a URL sucker.

    I wonder what broadband ISPs think of Grub.

  2. Google Toolbar by petree · · Score: 5, Interesting

    Couldn't Google do this anyway with the Google Toolbar? With the advanced features version, it tracks every page you visit. If they offered some incentive to install the toolbar, Google could just beat them at this game. I already use the Google Toolbar every day by choice (it makes my web searching more productive). All they have to do is get lots of people using it, and wouldn't that work just as well or better?

    1. Re:Google Toolbar by Kelerain · · Score: 5, Interesting

      This tracking is actually how a lot of important information leaks out. Security through obscurity has always been a poor man's system, and this busts it wide open. I won't post them here, but there are several interesting searches you can do that give personal results for things that REALLY have NO place on a publicly accessible page. On a more positive note, Google already uses distributed computing through their toolbar (http://toolbar.google.com/dc/offerdc.html). For now they donate the cycles to various worthy causes like Folding@home (currently their only beneficiary), but it is conceivable that if they came up with some secure and useful search-related thing to do with the cycles, they could put it to use almost instantaneously. I don't think there are significant benefits (plenty of discussion elsewhere here) for them to want to use it, however.

  3. Re:Not news for us webmasters by Redwing · · Score: 5, Interesting

    Here is what slashdotters were saying about grub almost 2 years ago.

    --
    Raisinettes are my raison d'etre
  4. Re:search.msn.com is the future by shibbydude · · Score: 5, Interesting
    In particular, the company has its own team of editors that monitors the most popular searches being performed and then hand-picks sites that are believed to be the most relevant.

    You have to be kidding or working for Microsoft, or both! Have you ever searched for Linux on MSN? Try it - here.

    Notice the third result? "Learn about the Microsoft alternatives and how to move to them from open source products." I shit you not! I don't think Google would ever use this kind of dirty, underhanded trick. Great "hand-picking", mate.

    --
    We're only gonna die from our own arrogance, that's why we might as well take our time...
  5. Re:Will Grub take off or be smashed? by dtfinch · · Score: 5, Interesting

    There are many ways to look at this. The idea is to install the client, set Opera to use the same useragent string, visit some of those sites, then blame it on Grub if the FBI comes busting through your door.

    If you're a criminal, installing the Grub client might be a great idea.

  6. Re:Great idea, but will it pan out? by Nickilo · · Score: 5, Interesting

    "The General's Dilemma" would solve this problem. The story goes something like this: The general needs to get urgent information to one of his officers, but he suspects saboteurs are present among his messengers. To ensure the information gets through accurately, he sends the same message with several men. The officer on the other end collects all the messages and goes with the majority. (And, presumably, kills the others.)
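
    Applied to a distributed crawler, the idea above might look roughly like the sketch below: hand the same URL to several clients, accept whatever result most of them agree on, and flag the clients that disagreed. This is only an illustration of the majority-vote scheme from the story, not how Grub actually works; the function names, the client IDs, and the use of a page checksum as the "message" are all assumptions made for the example.

```python
from collections import Counter


def majority_value(values):
    """Return the value reported by a strict majority, or None if there is none."""
    values = list(values)
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    # Require more than half of the reports to agree before trusting a value.
    return value if count > len(values) / 2 else None


def dissenters(reports, accepted):
    """List the (hypothetical) client IDs whose report differs from the accepted value."""
    return [client for client, value in reports.items() if value != accepted]


if __name__ == "__main__":
    # Hypothetical reports: client ID -> checksum each client claims for one URL.
    reports = {"client-a": "9f2c", "client-b": "9f2c", "client-c": "0000"}
    accepted = majority_value(reports.values())
    print("accepted:", accepted)
    print("suspect clients:", dissenters(reports, accepted))
```

    A strict majority is used here so that a tie accepts nothing. Note that this only helps while saboteurs are a minority of the clients asked about any given URL.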