Slashdot Mirror


How To Build a Web Spider On Linux

IdaAshley writes, "Web spiders are software agents that traverse the Internet gathering, filtering, and potentially aggregating information for a user. This article shows you how to build spiders and scrapers for Linux to crawl a Web site and gather information (stock data, in this case). Using common scripting languages and their collection of Web modules, you can easily develop Web spiders."

6 of 104 comments

  1. Hmm... by joe_cot · · Score: 5, Funny

    Yes, but does it run on ... damn.

  2. The 90s called by dave562 · · Score: 5, Funny

    They want their technology back.

  3. Re:yes, I did RTFA by Faylone · · Score: 4, Funny

    You RTFA? Are you sure you're in the right place?

  4. It's a trap! by radu.stanca · · Score: 2, Funny

    Ah, I can see it clearly now!

    1. Post a decoy article to Slashdot (with "Linux" in the subject) describing new spam tricks
    2. Watch whether spam increases 30% over the next few days
    3. Bribe Cowboy Neal with 10G midget lesbian pr0n and get the IP addresses of the article's readers
    4. Load shotgun and make the world a better place!

  5. Re:Obligatory by k33l0r · · Score: 2, Funny

    Has there ever been a news story on Slashdot that doesn't have an "I, for one, welcome our new [Insert here] overlords" comment attached to it?

  6. Re:Crawling efficiently by Zonk+(troll) · · Score: 2, Funny

    Maybe because they don't know the first thing about efficiency? You'd be surprised how much programmers don't know/care about efficiency.

    If you're surprised about programmers not knowing/caring about efficiency, do you actually use a computer?
    --
    "The Federal Reserve is a fraudulent system."--Lew Rockwell
    End The FED.