Using Graph Theory To Predict NCAA Tournament Outcomes
New submitter SocratesJedi writes "Like many technically-minded people, I don't have a lot of time to keep up with sports. Nevertheless, trying to predict the outcome of the NCAA men's basketball tournament is a fun activity to share with friends, family and colleagues. This year, I abandoned my usual strategy of quasi-randomly choosing teams and instead modeled the win-loss history of all Division I teams as a weighted network. The network included information from 5242 games played during the 2011-2012 season. From this, teams came be ranked using tools from graph theory and those rankings can be used to predict tournament outcomes. Without any a priori information, this method accurately identified all the #1 seeds in the top 5 best teams. It also predicts that at least one underdog, Belmont (#14 seed), will reach the Elite Eight. Although the ultimate test will be how well it predicts tournament outcomes, initial benchmarks suggest 70-80% accuracy would not be unreasonable."
wouldn't running the algorithm against past years' records and testing against past tournament results be the best possible test to tune the algorithm?
Everyone knows who the big names are who are likely to make it to the final four. It's predicting how things will go at the middle and bottom, where teams are much more likely to be evenly matched, that's really hard.
SJW: Someone who has run out of real oppression, and has to fake it.
And my numbers are off. In 2011, 43 times out of 63, the lower seed won for about a 68% win rate.
See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year
That may work for pro sports, but not for college sports. In fact, because teams usually lose their nucleus after winning it all (players declare for the draft), it is rare for a team to make it to the final game two or more years in a row.
Some problems I see. Disclaimer: I know there's a margin of error here as the author said, and I know my observations will be based largely on anecdotal evidence, making it inferior. But if sports were so easy to predict there would be no sports gambling.
- That's probably too far for Belmont; a #14 has only ever gotten as far as the Sweet 16, twice (Cleveland State '86, Chattanooga '97). Lowest seed to make an Elite 8 is Missouri in 2002 as a #12 . Belmont is actually going to be one of the more popular upset picks, but they would have to upset two far superior teams twice in 3 days.
- It's a bit too "chalk". #1 seeds generally survive the first two games (undefeated against #16's, 55-14 v. #8's, 59-6 v. #9's), but the #2's have it worse (only four losses v. #15's, but 58-21 v. #7's and 29-21 v. #10's). I know two #12's, a #13 and a #14 doesn't seem like "chalk" but historically it's much more likely that we'll see more #5-7 or #10-11's. To have only one #2 not make the Elite 8 and all the #1's would be almost unheard of.
- A #12 always beats a #5, but three of them doing so in one year would seem unlikely, as they're only 39-89 overall.
- Some of the other first round matchups seem a bit improbably. It has every #6 and every #7 winning, for example.
You don't have time to follow sports, but you have time to model "information from 5242 games played during the 2011-2012 season".
You could be honest and just say you don't really care, but get involved in the playoffs because everyone else is talking about it.
I'm guessing your level 80 warlock probably doesn't 'have time' either. :)