Slashdot Mirror


Choose a Better Train With Web Scraping (hackaday.com)

szczys writes: Tired of his trains being constantly late, Eric Evenchick headed to the website of Via Rail (Canada's commuter train service) to find which trains had a better on-time rate. Unfortunately, the site offers only three days' worth of data through its dropdown selections, but a bit of investigating showed that the GET requests were open for about the last six months. Evenchick built a web scraper in Python, along with a web interface that queries the resulting SQL database. The harvested data shows system-wide delays that average more than twelve minutes (mostly due to commercial rail having the right-of-way). The good that comes of this? You can now choose your train based on the smallest likelihood of delay.
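For anyone curious what that pipeline might look like, here is a minimal sketch, not Evenchick's actual code. The endpoint URL, the query parameter names (`train`, `date`), the table layout, and the sample rows are all assumptions made up for illustration; the summary does not give the real URL scheme. It builds one GET URL per day for roughly the last six months, stores delay records in SQLite, and ranks trains by average delay. The network fetch is deliberately left out.

```python
import sqlite3
from datetime import date, timedelta
from urllib.parse import urlencode

# Hypothetical endpoint; the real site's URL scheme is not given in the summary.
BASE_URL = "https://example.com/train-status"


def status_urls(train, end, days=180):
    """Yield one GET URL per day for the `days` days up to and including `end`."""
    for offset in range(days):
        day = end - timedelta(days=offset)
        # Parameter names 'train' and 'date' are assumed, not taken from the site.
        query = urlencode({"train": train, "date": day.isoformat()})
        yield f"{BASE_URL}?{query}"


def store(conn, rows):
    """Insert (train, day, delay_minutes) tuples into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS arrivals (train TEXT, day TEXT, delay_min REAL)"
    )
    conn.executemany("INSERT INTO arrivals VALUES (?, ?, ?)", rows)
    conn.commit()


def average_delays(conn):
    """Return (train, mean delay) pairs, least-delayed train first."""
    return conn.execute(
        "SELECT train, AVG(delay_min) FROM arrivals "
        "GROUP BY train ORDER BY AVG(delay_min)"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Made-up sample rows standing in for scraped results.
    store(conn, [("A", "2015-01-01", 5), ("A", "2015-01-02", 15), ("B", "2015-01-01", 2)])
    print(average_delays(conn))
```

A real scraper would fetch each URL, parse the delay out of the response, and pass the parsed rows to `store`; it should also rate-limit its requests and respect the site's terms of service.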

2 of 50 comments

  1. Violating ToS? by Anonymous Coward · Score: 3, Insightful

    Check the site's terms of service; scraping the site's contents may violate the ToS.

    I wrote a similar app about 15 years ago to scrape the Edmonton Transit System's route schedules (conveniently posted in generally well-structured HTML at the time) so I could build a relational system and try to work out predictive routes and times. When I found out that what I was doing violated their ToS, I stopped my scraping service immediately (before getting called on it).

  2. Guy Writes Script by Anonymous Coward · Score: 2, Insightful

    So a guy wrote a script. Good for him, I guess, but why is this on /.?