WebCrawler Turns 10 Today


Posted by ryuzaki0 on Tuesday April 20, 2004 @11:10AM from the young-at-art dept.

Brian Pinkerton writes "WebCrawler, one of the first search engines on the 'Net, turns 10 today. You can read a short history of WebCrawler. When I wrote WebCrawler, one could do a credible job of crawling, indexing, and searching the Web from a single desktop PC. Today, the reality is a little bit different."

10 of 136 comments

Re:When did they give up.... by The Bungi · 2004-04-20 11:46 · Score: 3, Informative

There's MetaCrawler. If my memory serves me correctly, it appeared before WebCrawler went to this format.
I honestly don't remember the first time I saw MetaCrawler (but it used to be much simpler back then!), so I don't know if it predates Google. WebCrawler's idea, however, is not new, AFAIK.
Re:e-mailing results by Brianwa · 2004-04-20 12:03 · Score: 5, Informative

You can be emailed results from Google as well.
Simply email google@capeclear.com with the search terms in the subject line, and you will soon receive a response with the results. I think there is a limit to how many times a day you can use this, but I cannot find the link to the project webpage.
Re:I remember using Webcrawler before google... by qodfathr · 2004-04-20 12:03 · Score: 5, Informative

You are remembering raging.com, still up-and-running today.

--
Yes, it's true. This man has no dick.
Re:The WebCrawler Search Voyeur by Anonymous Coward · 2004-04-20 12:49 · Score: 3, Informative

It's still there in a slightly different incarnation.... http://www.metaspy.com
Re:Then and now... by berenddeboer · 2004-04-20 13:03 · Score: 3, Informative

Today only the largest websites can avoid a slashdotting with only 9 posts in the thread.

Not true, see Surviving Slashdotting with a Small Server. Lots of people tried to bring it down (see comments), but it survived with no trouble at all.

--
If I had a sig, I would put it here.
Re:Wow - the 1996 wayback WebCrawler page STILL WO by Bullet-Dodger · 2004-04-20 13:09 · Score: 2, Informative

I have NO idea how that space got in there...
Not your fault. Slashcode does that itself whenever there's a long enough unbroken string of characters, to stop page-widening posts.
Re:public search engine by Anonymous Coward · 2004-04-20 13:22 · Score: 1, Informative

yeah, it's called dmoz
Hardly one of the first by btempleton · 2004-04-20 13:22 · Score: 4, Informative

Internet searching way predates 1994. Archie by Peter Deutsch (the one from Montreal, not the American one) was one of the most popular applications on the internet in the 80s. The http search engines like Webcrawler and Lycos came much, much later on internet time scales.

--
Has it been over a year since you last donated to the Electronic Frontier Foundation?
nope by millette · 2004-04-20 19:01 · Score: 4, Informative

You just need 8 desktop machines and you can index a 10th of what google does. From a recent article:
Gigablast runs on eight desktop machines, each with four 160-GB IDE hard drives, two gigs of RAM, and one 2.6-GHz Intel processor. It can hold up to 320 million Web pages (on 5 TB), handle about 40 queries per second and spider about eight million pages per day. Currently it serves half a million queries per day to various clients, including some meta search engines and some pay-per-click engines.
I also read it was going to expand its index this year, but I wasn't able to find where I read that.
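
For anyone who wants to sanity-check the quoted Gigablast figures, here is a rough back-of-the-envelope sketch in Python. Every input comes from the quote above; the decimal units (1 TB = 1000 GB), the even spread of queries over a 24-hour day, and all of the variable names are my own assumptions, so treat the results as ballpark numbers only.

```python
# Rough check of the Gigablast numbers quoted above.
# Inputs are taken from the quote; units and names are assumptions.

MACHINES = 8
DRIVES_PER_MACHINE = 4
DRIVE_GB = 160                      # assumed decimal gigabytes
MAX_PAGES = 320_000_000             # "up to 320 million Web pages"
PAGES_SPIDERED_PER_DAY = 8_000_000  # "about eight million pages per day"
QUERIES_PER_DAY = 500_000           # "half a million queries per day"
PEAK_QPS = 40                       # "about 40 queries per second"

# Total raw disk across the cluster.
total_gb = MACHINES * DRIVES_PER_MACHINE * DRIVE_GB
total_tb = total_gb / 1000

# Disk budget per page if the index were full (1 GB = 1,000,000 KB).
kb_per_page = total_gb * 1_000_000 / MAX_PAGES

# Days needed to spider a full index once at the quoted rate.
days_full_crawl = MAX_PAGES / PAGES_SPIDERED_PER_DAY

# Average query rate, assuming queries are spread evenly over the day.
avg_qps = QUERIES_PER_DAY / 86_400

print(f"raw disk: {total_gb} GB (~{total_tb:.2f} TB)")
print(f"disk per page at capacity: {kb_per_page:.0f} KB")
print(f"days to spider a full index: {days_full_crawl:.0f}")
print(f"average load: {avg_qps:.1f} qps of a quoted {PEAK_QPS} qps")
```

So the "5 TB" figure matches the drive count, and the quoted daily query volume averages out well below the 40 queries per second the cluster is said to handle.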
Re:world wide worm? by Captain Kangaroo · 2004-04-21 07:18 · Score: 2, Informative

The WWWW (World-Wide Web Worm) pre-dated WebCrawler (and Jumpstation pre-dated it.) Jumpstation indexed only titles, while the Worm indexed both titles and anchor text (IIRC).