Slashdot Mirror


How To Build a Web Spider On Linux

IdaAshley writes, "Web spiders are software agents that traverse the Internet gathering, filtering, and potentially aggregating information for a user. This article shows you how to build spiders and scrapers for Linux that crawl a Web site and gather information (stock data, in this case). Using common scripting languages and their collections of Web modules, you can easily develop Web spiders."
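For readers who want a feel for what such a spider involves, here is a minimal sketch in Python using only the standard library. It is not the article's code: the names LinkExtractor, fetch, and crawl, the User-Agent string, and the page limit are all made up for illustration. It does a breadth-first crawl that stays on the starting host and returns the HTML of each page it reaches.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download one page; the User-Agent string is a hypothetical example."""
    request = Request(url, headers={"User-Agent": "example-spider/0.1"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=10):
    """Breadth-first crawl restricted to the start URL's host.

    Returns a dict mapping each visited URL to its HTML.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to download
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Passing the download function in as fetch_page keeps the crawl logic testable without network access. A scraper for something like stock data would then parse the returned HTML for the fields it cares about. A real spider should also honor robots.txt and rate-limit its requests, which this sketch does not do.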

3 of 104 comments

  1. ^sh1t by Anonymous Coward · · Score: -1, Troll

    fate. Let's 8ot be

  2. User-Agent by Joebert · · Score: 1, Troll

    They forgot to set the User-Agent header to IE.

    --
    Wanna fight ? Bend over, stick your head up your ass, and fight for air.
  3. Re:Hmm... by Anonymous Coward · · Score: -1, Troll

    Yes, it runs on every environment I know: Debian, SUSE, Gentoo, Fedora, etc. You just need Ruby and Python interpreters installed. What's funny about that comment?