Slashdot Mirror


How To Build a Web Spider On Linux

IdaAshley writes, "Web spiders are software agents that traverse the Internet gathering, filtering, and potentially aggregating information for a user. This article shows you how to build spiders and scrapers for Linux to crawl a Web site and gather information (stock data, in this case). Using common scripting languages and their collection of Web modules, you can easily develop Web spiders."
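
For readers who want a feel for what such a spider looks like, here is a rough sketch in Python using only the standard library. It is not the article's code: the names `LinkExtractor`, `extract_links`, `fetch`, and `crawl`, the `example-spider/0.1` User-Agent string, and the 10-page limit are all assumptions made up for illustration. It fetches a page, pulls out the links, and follows those that stay on the same host, breadth-first.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def fetch(url, user_agent="example-spider/0.1"):
    # Hypothetical User-Agent string; pick one that identifies your spider.
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_pages=10, fetch_page=fetch):
    """Breadth-first crawl restricted to the start URL's host."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            # Network and HTTP errors from urlopen are OSError subclasses.
            continue
        pages[url] = html
        for link in extract_links(html, url):
            link = urldefrag(link)[0]  # drop "#fragment" so pages are not revisited
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Calling `crawl("http://example.com/")` would return a dict mapping each visited URL to its HTML, which a scraper could then filter for the data it wants. A real spider should also honor robots.txt and rate-limit its requests, which this sketch leaves out.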

1 of 104 comments

  1. User-Agent by Joebert · Score: 1, Troll

    They forgot to set the User-Agent header to IE.

    --
    Wanna fight ? Bend over, stick your head up your ass, and fight for air.