Slashdot Mirror


WebCrawler Turns 10 Today

Brian Pinkerton writes "WebCrawler, one of the first search engines on the 'Net, turns 10 today. You can read a short history of WebCrawler. When I wrote WebCrawler, one could do a credible job of crawling, indexing, and searching the Web from a single desktop PC. Today, the reality is a little bit different."

55 of 136 comments

  1. Guess this celebration... by oberondarksoul · · Score: 5, Funny

    ...won't have an accompanying Google Doodle?

    --
    And tomorrow the stock exchange will be the human race
  2. e-mailing results by qewl · · Score: 4, Interesting

    Ah the nostalgia of receiving search results via e-mail :)

    --

    (\_/)
    (O.o) This is Bunny. (> <)
    1. Re:e-mailing results by Brianwa · · Score: 5, Informative

      You can be emailed results from Google as well.
      Simply email google@capeclear.com with the search terms in the subject line, and you will soon receive a response with the results. I think there is a limit to how many times a day you can use this, but I cannot find the link to the project webpage.

  3. Birthdays by 7Ghent · · Score: 4, Funny

    Happy birthday to Webcrawler AND Hitler! Hurray!

    1. Re:Birthdays by stevejsmith · · Score: 3, Funny

      ...and 4/20.

  4. They used to be my google.... by wo1verin3 · · Score: 5, Interesting

    I remember when webcrawler was the only search engine I touched...

    In 1996 it was nice and simple. Then as time went on it got a bit too cluttered for my liking. Now it looks like they're trying to googlize themselves with the current interface.

    1. Re:They used to be my google.... by Basehart · · Score: 4, Funny

      I really like their pantyhosecrawler companion site.

      Very cool research tool.

  5. Whoa! by outZider · · Score: 5, Interesting

    Holy crap!

    I remember WebCrawler, but lost touch with it around 1996, when I started religiously using AltaVista. They sure have changed a bit. ... but do they have any relevance anymore? They're owned by InfoSpace. :P

    --
    - oZ
    // i am here.
  6. Wow by z0ink · · Score: 5, Interesting

    Does anybody else remember getting a WebCrawler promotional CD 10 years ago? I didn't even have a CD-ROM then!

    --
    Steal This Sig
  7. I remember using Webcrawler before google... by John+Seminal · · Score: 5, Insightful
    It was a good search engine. I dunno why I stopped using it, I think it was a bit on the slow side and Google had more pages.

    Heck, while reminiscing, I remember when excite was my start page, and when I used them for email. I remember they were the first "start" page to have groups. I stopped using them 4 years ago when their email stopped working.

    I guess if anything, we can learn the web is not going to be the same in 5 years as it is today. My question is, "is it better"? Personally, I think it was better back in the day. I would like to see a search engine that does not display any spam or sales or sex sites as hits. I now do most of my searches on google doing "search parameters site:edu".

    --

    Rosco: "If brains were gunpowder, Enos couldn't blow his nose."

    1. Re:I remember using Webcrawler before google... by Anonymous Coward · · Score: 5, Interesting

      I think I remember why I left WebCrawler.

      WebCrawler was simple and effective. But then AltaVista emerged. I started using AltaVista.digital.com, and from there WebCrawler went downhill - lots of advertising and junk that kind of made me hate it. What was once seamless and simple became noisy.

      I used AltaVista for a number of years, but once again advertising got the best of it. It turned super-sophisticated, with a lot of advertising fluff and "features". AltaVista was becoming overly commercialized. They had a "simple" version that was better (I forget the name [begins with an "R"?]), but soon the result sets were skewed towards advertisers and abusers.

      In 2001, I made the switch to Google. It was everything that WebCrawler once was in terms of ease of use and quality of results. I've been more or less happy with Google ever since.

    2. Re:I remember using Webcrawler before google... by qodfathr · · Score: 5, Informative

      You are remembering raging.com, still up-and-running today.

      --
      Yes, it's true. This man has no dick.
  8. /me 's jaw hits the floor by Stalin · · Score: 3, Funny

    I can't believe it is even still around.

  9. Then and now... by jdreed1024 · · Score: 5, Funny
    When I wrote WebCrawler, one could do a credible job of crawling, indexing, and searching the Web from a single desktop PC. Today, the reality is a little bit different.

    No kidding. Back then, one could serve a website from most any machine, and it would be there for all to see. Today only the largest websites can avoid a slashdotting with only 9 posts in the thread.

    --
    There is no sig, there is only Zuul.
    1. Re:Then and now... by mph · · Score: 5, Funny
      Today only the largest websites can avoid a slashdotting with only 9 posts in the thread.
      Imagine how bad it would be if everyone actually read the articles.
    2. Re:Then and now... by berenddeboer · · Score: 3, Informative
      Today only the largest websites can avoid a slashdotting with only 9 posts in the thread.

      Not true, see Surviving Slashdotting with a Small Server. Lots of people tried to bring it down (see comments), but it survived with no trouble at all.

      --
      If I had a sig, I would put it here.
  10. Holy search engines batman! by ylikone · · Score: 3, Interesting

    Who uses webcrawler anymore? I didn't even know they still exist. Anybody remember opentext.com search?

    --
    Meh.
  11. Birthday party by jacobhoupt · · Score: 5, Funny

    I'll be hosting my tenth annual WebCrawler birthday party tonight in the back of my Yugo.

    Feel free to drop in, there should be plenty of seating available for those interested.

    --
    -- the only good thing the French ever did was two chicks at one time
  12. my new hero by theMerovingian · · Score: 4, Funny


    Some guys are too cool for their own good. Brian Pinkerton has the domain 'thinkpink.com', AND he wrote his own search engine.

    I bet he even has a 3-digit UID, a beowulf cluster of Xboxes running linux, and he sold all his stock options during the bubble. :)

    --
    "If you think you have things under control, you're not going fast enough." --Mario Andretti
  13. Already Slashdotted by MrRuslan · · Score: 3, Funny

    Here is the google cache
    http://216.239.39.104/search?q=cache:-vPR77Hq9OYJ:www.thinkpink.com/bp/WebCrawler/History.html+&hl=en&ie=UTF-8

  14. When did they give up.... by David+Hume · · Score: 5, Interesting

    ...on their own web search technology and become a metasearch engine? From the WebCrawler About Page:

    WebCrawler uses innovative metasearch technology to search the Internet's top search engines, including Google, Yahoo, Ask Jeeves, About, Teoma, FindWhat, LookSmart, and many more.

    With one single click, WebCrawler searches the best results from the combined pool of the world's leading search engines -- instead of results from only one single search engine.

    And WebCrawler makes it easy to refine your search so you can find the most meaningful results right away. No wonder it's a leader in the search industry.


    Was it 2001? The History states:

    2001 InfoSpace acquires WebCrawler. Excite, now Excite@Home, went belly up. In the bankruptcy, Infospace acquired WebCrawler. Today Infospace runs WebCrawler as a meta-search engine. And they've given Spidey a new name and turned him purple!


    Oh, and if it is not being otherwise used, has the code for the WebCrawler spider been open-sourced? :)

    1. Re:When did they give up.... by The+Bungi · · Score: 3, Informative
      There's MetaCrawler. If my memory serves me correctly, it appeared before WebCrawler went to this format.

      I honestly don't remember the first time I saw MetaCrawler (but it used to be much simpler back then!) so I don't know if it predates Google. WebCrawler's idea however is not new, AFAIK.

  15. Boy, does this take me back... by Faust7 · · Score: 4, Interesting

    ...to the days when the search engine market resembled the microcomputer market of the '80s. Several competitors, all with (roughly) the same market share, each with a certain number of hits that the others didn't have. I had to use at least a few of them to assure myself that I was getting something reasonably close to what the whole Web could offer on my search topic (even though no search engine comes close to penetrating all of the pages out there).

    If I was looking for something, I'd query Lycos, AltaVista, Infoseek, Excite, Webcrawler, and Magellan. And, later on, Google. Vastly different results, site designs, site objectives. I won't say it was the most streamlined, elegant experience, but it was kind of fun.

  16. Oh, Yahoo too. by Faust7 · · Score: 3, Interesting

    Oh yeah, and Yahoo as well. Forgot to include them.

    Interestingly, their look has changed very, very little from their olden days.

    1. Re:Oh, Yahoo too. by sflory · · Score: 2, Interesting

      Yahoo in the past has never done their own search engine. They've used a number of backends, including Google. This was true up until they acquired Inktomi. Late last year they launched Yahoo search using Inktomi's search engine.

      --
      IANALBIPOOGL (I am not a Lawyer, but I play one on GrokLaw.)
  17. Wow. Just. Wow. by TWX · · Score: 5, Interesting

    I remember using Webcrawler back when I got my first 14.4 Slirp connection back in 1994. It was the only way to search!

    and then came the marvels of altavista.digital.com.

    I'm so glad that google came along...

    --
    Do not look into laser with remaining eye.
  18. Wow - the 1996 wayback WebCrawler page STILL WORKS by Anonymous Coward · · Score: 5, Interesting

    http://web.archive.org/web/19961023234707/http://w ww.webcrawler.com/

    Presumably connects to the current crawler which still accepts the old format :)

    --
    Callas

  19. Re:Wow - the 1996 wayback WebCrawler page STILL WO by Anonymous Coward · · Score: 2, Interesting

    1996 WebCrawler

    I have NO idea how that space got in there...

    --
    Callas

  20. First query? by Xzzy · · Score: 4, Funny

    So who remembers the first search query they typed into Webcrawler?

    I was just crawling out of the gopher world, a short period where I was getting turned on to the web but there was no way to find links, almost everything came through the university homepage or word of mouth. Then someone pointed me to webcrawler.

    What did I search for first? "fart jokes". No kidding.

    "boobs" was second.

  21. Well isn't that ironic by Zygote-IC- · · Score: 5, Funny

    So, to read a story celebrating an anniversary about a search engine, we have to go through the cache of another search engine?

    Go figure.

    1. Re:Well isn't that ironic by Adam9 · · Score: 2, Insightful

      No worries, just go here

  22. WebCrawler on NeXTStep - before Open Source by ben_kelley · · Score: 5, Interesting

    It is scary to think that at one point I e-mailed the WebCrawler people to ask them how it worked. In response they sent me a copy of the source (Objective C for NeXT) so I could compile it up on my NeXT PC (I had a "black" NeXT - 68000 based) to index my intranet web server.

    I doubt that someone like Google would send you a copy of their source these days - even if you asked nicely.

    I could never get it to compile, and I deleted it long ago, but I kind of wish I had kept it now. An interesting piece of internet history.

    1. Re:WebCrawler on NeXTStep - before Open Source by houseofmore · · Score: 3, Funny

      Ah it's just a perl module I think. Google::Search or something or other...

    2. Re:WebCrawler on NeXTStep - before Open Source by ArbitraryConstant · · Score: 2, Insightful

      I doubt that someone like Google would send you a copy of their source these days - even if you asked nicely.

      The next best thing.

      search appliance

      --
      I rarely criticize things I don't care about.
    3. Re:WebCrawler on NeXTStep - before Open Source by sacremon · · Score: 5, Interesting

      I sometimes sit back and think about some of the various projects that first saw life on the NeXT platform:

      the first web server
      Webcrawler
      Doom and DoomII

      Pretty good for a machine that only sold ~70,000 units total, not including the versions of NEXTSTEP for ix86/SPARC/PA-RISC.

      I still have a Color NeXTStation stashed away in a closet. I was using it as a print server till about two years ago.

      --
      If you can't beat them, embrace and extend them.
  23. Worth Remember by Anonymous Coward · · Score: 2, Funny

    I've reminisced before on slashdot about the beautiful geeky girl that introduced me to hotbot. Glasses, long blond hair, full breasts... cute sandals, short skirts... those silk panties.

    Fuck WebCrawler. hotbot.

  24. The WebCrawler Search Voyeur by Faust7 · · Score: 5, Funny

    Anyone remember the WebCrawler Search Voyeur?

    It was a little Java applet that sat on your screen and displayed the pseudo-real-time search queries of other people.

    When I was a computer lab monitor at my college, we used to note in the log book any particularly amusing queries that we'd seen.

    "hairy woman"... "squirrel torture"... "tom AND cruise AND foot AND odor"... "asian girl underage spanking"...

    1. Re:The WebCrawler Search Voyeur by intangible · · Score: 4, Funny

      I remember that it wouldn't show every search of course, but you could verify it was working by searching for the same phrase over and over again. About 10 seconds later, you could see your search phrase. You could actually use it to communicate with other people, albeit a little slow, but it was amusing. I would type in silly things just so others watching the voyeur would see them.
      I bet you guys recorded some of my stuff :P

    2. Re:The WebCrawler Search Voyeur by Anonymous Coward · · Score: 3, Informative

      It's still there in a slightly different incarnation.... http://www.metaspy.com

  25. public search engine by jacquesm · · Score: 4, Interesting

    I'd happily contribute cash to a publicly funded and publicly run search engine.

    Anyone game ?

    1. Re:public search engine by Chess_the_cat · · Score: 2, Funny

      Wait until Google goes public and buy stock.

      --
      Support the First Amendment. Read at -1
  26. I remember the exact day by Anonymous Coward · · Score: 2, Funny

    The exact day that I stopped using webcrawler. It happened to coincide with the day that AOL was announced as the new owner.

  27. I didn't even know they were around anymore. by Captain+Rotundo · · Score: 2, Interesting

    And the odd part is I don't even remember the interface being as cluttered as the very early one linked through the archive in an earlier post. I suppose I moved on very early, although I remember when as far as I was concerned they were the only game in town.

  28. Takes me back by SuperBigGulp · · Score: 3, Interesting

    I remember using WebCrawler on my very first SLIP dial up account and thinking "How cool is this?" I had used AOL for a couple years prior but was hoping to trade in their UI (and limitations) for Netscape. The funny thing is that I wasn't sure if I could find enough content on the web.

    Also a great testament to the original design and concept that search engines still look and work a lot like WebCrawler, 10 years on.

    Happy birthday, and thanks for the walk down 32K memory lane

    --
    Someday a Slashdot ID of 177180 will mean something.
  29. Re:OT, your sig by antic · · Score: 2, Funny

    If I updated the sig, would as many people click it?

    --
    'Thats they exact same thing a banana wrench monkey.'
  30. The more things change... by Old+Man+Kensey · · Score: 3, Interesting
    Originally there was WebCrawler (among others). In late 1996, AOL acquired WebCrawler and turned it into AOL Netfind. Later, apparently, Excite bought it from AOL, made it a separate service, and Excite became the engine that powered AOL Netfind. After that apparently InfoSpace bought it in the Excite sell-off.

    But after AOL bought it I lost track of it, because it started sucking (returning lots more stale links than before), and altavista.digital.com burst upon the scene (anyone else remember "kayak sailing San Juan islands"?).

    My guess would be that the meta-search switch initially happened when Excite bought them.

    --
    -- Old Man Kensey
  31. Re:Wow - the 1996 wayback WebCrawler page STILL WO by Bullet-Dodger · · Score: 2, Informative
    I have NO idea how that space got in there...

    Not your fault. Slashcode does that itself whenever there's a long enough unbroken string of characters, to stop page-widening posts.
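
    The behaviour described above can be sketched roughly as follows. This is an illustration only, not Slashcode's actual implementation (which is written in Perl); the function name break_long_runs is hypothetical, and the 50-character threshold is an assumption inferred from where the space landed in the archive.org link earlier in the thread.

```python
import re


def break_long_runs(text: str, max_run: int = 50) -> str:
    """Insert a space after every `max_run` consecutive non-whitespace
    characters, so a single long token cannot widen the page.

    Hypothetical sketch: the threshold is an assumption, not a documented
    Slashcode setting.
    """
    # Match exactly max_run non-space characters, but only when more
    # non-space characters follow (so nothing is appended at the end).
    pattern = re.compile(r"(\S{%d})(?=\S)" % max_run)
    return pattern.sub(r"\1 ", text)


url = "http://web.archive.org/web/19961023234707/http://www.webcrawler.com/"
broken = break_long_runs(url)
print(broken)
```

    Under that assumed threshold, the stray space falls between the first and second "w" of "www", which is consistent with the mangled link the earlier poster was puzzled by.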

  32. Hardly one of the first by btempleton · · Score: 4, Informative

    Internet searching way predates 1994. Archie by Peter Deutsch (the one from Montreal, not the American one) was one of the most popular applications on the internet in the 80s. The http search engines like Webcrawler and Lycos came much, much later on internet time scales.

    --
    Has it been over a year since you last donated to the Electronic Frontier Foundation
  33. The one before WebCrawler? by SnappingTurtle · · Score: 2, Insightful
    I seem to remember that before WebCrawler there was actually a "big" search engine run by a non-profit. For the life of me I can't remember what it was, but I seem to remember one day going "Wow, this webcrawler thing is great, I'm never touching [whatever] again."

    Of course a few years later I said "Wow, this AltaVista thing is great. I'm never touching WebCrawler again." And then I went "Wow, this Google thing is great. I'm never touching AltaVista again."

    --
    I've found that my posts don't format quite right w/o a sig.
  34. Wow... by }InFuZeD{ · · Score: 3, Interesting

    I think WebCrawler was my first search engine ever...

    From there I graduated to MetaCrawler, which parsed WebCrawler and all the other currently popular web search engines at the time.

    For some reason or another MetaCrawler started sucking and I used InfoSeek for quite some time... then they were acquired by Go.com and it went downhill from there.

    I remember what I'd search the internet for back in those days tho. It was always "jedi knight" and "giga pets" (remember those cute tamagotchi rip-offs? =p)

  35. More WebCrawler History by Anonymous Coward · · Score: 2, Interesting

    I used to be one of the Excite@Home engineers who looked after Webcrawler. WebCrawler and the Excite front end all belonged to the same code base called My Excite Start Page (known internally as MESP at Excite).

    The WebCrawler at Excite was pretty much an unsupported product when I was there. All I ever did were maintenance releases, never any new stuff for WebCrawler. WebCrawler was actually the Excite front end, except it had the WebCrawler logos instead of Excite.

    The search engine was the Excite search engine as well as all links on the front page pointed to Excite media properties.

    On a personal note, I was always somewhat saddened to see a piece of Internet history neglected the way it was but by December 2001, Webcrawler received very little traffic (I forget the numbers).

  36. WebCrawler Sale Sensation by Anonymous Coward · · Score: 4, Interesting

    WebCrawler's Brian Pinkerton formerly worked at NeXT, and I remember being in the NeXT kitchen when news arrived in 1995 of his sale of WebCrawler to AOL. The sale price was around $1 million, and everyone was absolutely awed that a software company could sell for so much. This marked the beginning for me of the dot-com era: Just a few months later, other companies started or run by ex-NeXTers sold for millions, then tens of millions, and at least one for hundreds of millions. Soon after that, NeXT CEO Jobs took Pixar through an IPO, for a personal gain of about $1 billion!

  37. Ah... back in the day by StefanSavage · · Score: 3, Interesting

    I remember back in 1994 WebCrawler was running on three machines in the corner of Sieg Hall 433. They were rigged up so one could reboot the others via a serial line, but occasionally that machine would crash too. That was when Brian would call in and say "Hey, Webcrawler is hung. Could you go reboot it?". I'm guessing this doesn't happen much at Google...

  38. nope by millette · · Score: 4, Informative
    You just need 8 desktop machines and you can index a 10th of what google does. From a recent article:
    Gigablast runs on eight desktop machines, each with four 160-GB IDE hard drives, two gigs of RAM, and one 2.6-GHz Intel processor. It can hold up to 320 million Web pages (on 5 TB), handle about 40 queries per second and spider about eight million pages per day. Currently it serves half a million queries per day to various clients, including some meta search engines and some pay-per-click engines.
    I also read it was going to expand its index this year, but I wasn't able to find where I read that.
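
    For a rough sense of scale, the figures quoted from the article can be checked with some back-of-the-envelope arithmetic. The sketch below uses only the numbers in the quote; the function name gigablast_estimates is hypothetical, and it assumes decimal gigabytes, a full-capacity index, and queries spread evenly over the day, none of which the article states.

```python
def gigablast_estimates() -> dict:
    """Back-of-the-envelope figures derived from the quoted article.

    Assumptions (not from the article): 1 GB = 10**9 bytes, the index is
    filled to its stated capacity, and queries arrive evenly over 24 hours.
    """
    machines = 8
    drives_per_machine = 4
    drive_gb = 160
    max_pages = 320_000_000
    pages_spidered_per_day = 8_000_000
    queries_per_day = 500_000
    peak_qps = 40

    total_gb = machines * drives_per_machine * drive_gb   # raw disk, in GB
    bytes_per_page = total_gb * 10**9 / max_pages          # disk budget per page
    full_crawl_days = max_pages / pages_spidered_per_day   # time to spider everything once
    average_qps = queries_per_day / 86_400                 # seconds in a day
    load_fraction = average_qps / peak_qps                 # share of stated capacity in use

    return {
        "total_gb": total_gb,
        "bytes_per_page": bytes_per_page,
        "full_crawl_days": full_crawl_days,
        "average_qps": average_qps,
        "load_fraction": load_fraction,
    }


estimates = gigablast_estimates()
for name, value in estimates.items():
    print(f"{name}: {value:.2f}")
```

    On those assumptions the disks total 5,120 GB (the article's "5 TB"), which works out to about 16 KB of disk per indexed page, roughly 40 days to spider the full index once, and an average load of under 6 queries per second against a stated capacity of 40.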
  39. Re:world wide worm? by Captain+Kangaroo · · Score: 2, Informative

    The WWWW (World-Wide Web Worm) pre-dated WebCrawler (and Jumpstation pre-dated it.) Jumpstation indexed only titles, while the Worm indexed both titles and anchor text (IIRC).