Choose a Better Train With Web Scraping (hackaday.com)

Posted by Soulskill on Friday December 4, 2015 @09:02AM from the it's-never-on-time-when-you-need-it-to-be dept.

szczys writes: Tired of his trains being constantly late, Eric Evenchick headed to the website of Via Rail (Canada's commuter train service) to find which trains had a better on-time rate. Unfortunately, the site offers only three days' worth of data through its dropdown selections, but a bit of investigating showed the GET requests were open for about the last six months. Evenchick built a web scraper in Python, along with a web interface that queries the resulting SQL database. The harvested data shows system-wide delays that average more than twelve minutes (mostly due to commercial rail having the right-of-way). The good that comes of this? You can now choose your train based on the smallest likelihood of delay.
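
For readers wondering what such a pipeline might look like, here is a minimal sketch, not Evenchick's actual code. It builds one GET URL per day for roughly six months, stores delay records in SQLite, and ranks trains by average delay. The endpoint, query parameters, table layout, train numbers, and sample delays are all made up for illustration; the summary does not give the real ones. The fetch-and-parse step is left out because it depends on the real page format.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical endpoint: the real URL and parameters are not given in the summary.
BASE_URL = "https://example.com/train-status"


def status_urls(train, end, days=180):
    """Yield one GET URL per day for the `days` days ending at `end`."""
    for offset in range(days):
        day = end - timedelta(days=offset)
        yield f"{BASE_URL}?train={train}&date={day.isoformat()}"


def store(conn, rows):
    """Save (train, day, delay_minutes) tuples into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS arrivals "
        "(train TEXT, day TEXT, delay_minutes INTEGER)"
    )
    conn.executemany("INSERT INTO arrivals VALUES (?, ?, ?)", rows)
    conn.commit()


def average_delays(conn):
    """Return (train, average delay) pairs, most punctual train first."""
    return conn.execute(
        "SELECT train, AVG(delay_minutes) FROM arrivals "
        "GROUP BY train ORDER BY AVG(delay_minutes)"
    ).fetchall()


# Made-up sample rows standing in for scraped results.
conn = sqlite3.connect(":memory:")
store(conn, [("60", "2015-12-01", 5), ("60", "2015-12-02", 15),
             ("62", "2015-12-01", 30), ("62", "2015-12-02", 10)])
ranking = average_delays(conn)
```

With the sample rows above, `ranking` lists train "60" ahead of train "62", since its average delay is lower. A web front end would only need to run the same query against the populated database.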

Violating ToS? by Anonymous Coward · 2015-12-04 09:07 · Score: 3, Insightful

Check the site's terms of service; scraping site contents may be in violation of the ToS.
I wrote a similar app about 15 years ago to scrape the Edmonton Transit System's route schedules (conveniently posted in generally well-structured HTML at the time) so I could build a relational system and try to sort out predictive routes and times. When I found out that what I was doing violated their ToS, I stopped my scraping service immediately (before getting called on it).

Guy Writes Script by Anonymous Coward · 2015-12-04 09:30 · Score: 2, Insightful

So a guy wrote a script. Good for him, I guess, but why is this on /.?