Slashdot Mirror


WebCrawler Turns 10 Today

Brian Pinkerton writes "WebCrawler, one of the first search engines on the 'Net, turns 10 today. You can read a short history of WebCrawler. When I wrote WebCrawler, one could do a credible job of crawling, indexing, and searching the Web from a single desktop PC. Today, the reality is a little bit different."

10 of 136 comments

  1. Re:When did they give up.... by The Bungi · · Score: 3, Informative
    There's MetaCrawler. If my memory serves me correctly, it appeared before WebCrawler went to this format.

    I honestly don't remember the first time I saw MetaCrawler (but it used to be much simpler back then!) so I don't know if it predates Google. WebCrawler's idea, however, is not new, AFAIK.

  2. Re:e-mailing results by Brianwa · · Score: 5, Informative

    You can be emailed results from Google as well.
    Simply email google@capeclear.com with the search terms in the subject line, and you will soon receive a response with the results. I think there is a limit to how many times a day you can use this, but I cannot find the link to the project webpage.

  3. Re:I remember using Webcrawler before google... by qodfathr · · Score: 5, Informative

    You are remembering raging.com, still up-and-running today.

    --
    Yes, it's true. This man has no dick.
  4. Re:The WebCrawler Search Voyeur by Anonymous Coward · · Score: 3, Informative

    It's still there in a slightly different incarnation.... http://www.metaspy.com

  5. Re:Then and now... by berenddeboer · · Score: 3, Informative
    Today only the largest websites can avoid a slashdotting with only 9 posts in the thread.

    Not true, see Surviving Slashdotting with a Small Server. Lots of people tried to bring it down (see comments), but it survived with no trouble at all.

    --
    If I had a sig, I would put it here.
  6. Re:Wow - the 1996 wayback WebCrawler page STILL WO by Bullet-Dodger · · Score: 2, Informative
    I have NO idea how that space got in there...

    Not your fault. Slashcode does that itself whenever there's a long enough unbroken string of characters, to stop page-widening posts.

  7. Re:public search engine by Anonymous Coward · · Score: 1, Informative

    yeah, it's called dmoz

  8. Hardly one of the first by btempleton · · Score: 4, Informative

    Internet searching way predates 1994. Archie by Peter Deutsch (the one from Montreal, not the American one) was one of the most popular applications on the internet in the 80s. The HTTP search engines like WebCrawler and Lycos came much, much later on internet time scales.

    --
    Has it been over a year since you last donated to the Electronic Frontier Foundation
  9. nope by millette · · Score: 4, Informative
    You just need 8 desktop machines and you can index a tenth of what Google does. From a recent article:
    Gigablast runs on eight desktop machines, each with four 160-GB IDE hard drives, two gigs of RAM, and one 2.6-GHz Intel processor. It can hold up to 320 million Web pages (on 5 TB), handle about 40 queries per second and spider about eight million pages per day. Currently it serves half a million queries per day to various clients, including some meta search engines and some pay-per-click engines.
    I also read it was going to expand its index this year, but I wasn't able to find where I read that.
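
    A quick sanity check of the quoted Gigablast figures, as a rough sketch only. The constants come straight from the quote; the function name, the dictionary keys, and the use of decimal units (1 GB = 10**9 bytes) are assumptions made for illustration.

```python
# Back-of-the-envelope arithmetic on the Gigablast numbers quoted above.
# Assumption: decimal units (1 GB = 10**9 bytes), as drive vendors use.
MACHINES = 8
DRIVES_PER_MACHINE = 4
DRIVE_GB = 160
MAX_PAGES = 320_000_000        # "up to 320 million Web pages"
PEAK_QPS = 40                  # "about 40 queries per second"
QUERIES_PER_DAY = 500_000      # "half a million queries per day"
SPIDERED_PER_DAY = 8_000_000   # "about eight million pages per day"
SECONDS_PER_DAY = 86_400


def gigablast_estimates():
    """Derive a few rough figures from the quoted cluster description."""
    total_gb = MACHINES * DRIVES_PER_MACHINE * DRIVE_GB
    return {
        # Raw disk across the cluster, which the quote rounds to 5 TB.
        "total_gb": total_gb,
        # Disk budget per stored page if the index fills every drive.
        "bytes_per_page": total_gb * 10**9 / MAX_PAGES,
        # Average load implied by the daily query count.
        "avg_qps": QUERIES_PER_DAY / SECONDS_PER_DAY,
        # Share of the stated peak capacity that the average load uses.
        "load_fraction": QUERIES_PER_DAY / SECONDS_PER_DAY / PEAK_QPS,
        # Days to spider a full index once at the stated crawl rate.
        "days_to_crawl_full_index": MAX_PAGES / SPIDERED_PER_DAY,
    }


if __name__ == "__main__":
    for name, value in gigablast_estimates().items():
        print(f"{name}: {value:,.2f}")
```

    By this arithmetic the quoted numbers are at least self-consistent: 32 drives of 160 GB come to about 5 TB, and a full pass over 320 million pages at eight million a day takes roughly 40 days.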
  10. Re:world wide worm? by Captain Kangaroo · · Score: 2, Informative

    The WWWW (World-Wide Web Worm) pre-dated WebCrawler (and Jumpstation pre-dated it). Jumpstation indexed only titles, while the Worm indexed both titles and anchor text (IIRC).
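
    To make the titles-only versus titles-plus-anchor-text distinction concrete, here is a toy sketch. It is not how Jumpstation or the Worm were actually built; the sample pages, the `build_index` helper, and the choice to credit anchor text to the link's target URL are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical two-page web: page A links to page B with descriptive text,
# while page B itself has an uninformative title.
SAMPLE_PAGES = {
    "http://a.example/": {
        "title": "Home Page",
        "links": [("http://b.example/", "weather forecasts")],
    },
    "http://b.example/": {
        "title": "Untitled",
        "links": [],
    },
}


def build_index(pages, include_anchor_text):
    """Build a word -> set-of-URLs inverted index.

    Title words are credited to the page itself. If include_anchor_text
    is true, words in a link's text are credited to the link's target
    (an assumption of this sketch).
    """
    index = defaultdict(set)
    for url, page in pages.items():
        for word in page["title"].lower().split():
            index[word].add(url)
        if include_anchor_text:
            for target, text in page["links"]:
                for word in text.lower().split():
                    index[word].add(target)
    return dict(index)


titles_only = build_index(SAMPLE_PAGES, include_anchor_text=False)
titles_and_anchors = build_index(SAMPLE_PAGES, include_anchor_text=True)

if __name__ == "__main__":
    print(titles_only.get("weather", set()))
    print(titles_and_anchors.get("weather", set()))
```

    In this toy data a search for "weather" finds nothing in the titles-only index, while the index that includes anchor text returns page B, because another page described it in a link.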