Slashdot Mirror


Computer System Makes Best Sports Bets

schliz writes to tell us that a new computer system using the "Logistic Regression Markov Chain" (LRMC) has proven to be the most efficient system at predicting sporting event outcomes. The system was tested on the 2008 US NCAA basketball season and picked all four of the finalists. "Similar to other rankings systems, LRMC uses the quality of each NCAA team's results and the strength of each team's schedule to rank teams. The method has been designed to use only basic scoreboard data, including which teams played, which team had home court advantage and the margin of victory."

73 comments

  1. For the first time in a while... by insertwackynamehere · · Score: 5, Insightful

    The final four were also all #1s in their league. Coincidence? This has never happened before I believe and if the computer calculates odds the way the teams are ranked, then this may not always be so reliable.

    1. Re:For the first time in a while... by cpricejones · · Score: 1

      I wonder if the computer picked Davidson going to the Elite 8 ...

    2. Re:For the first time in a while... by Anonymous Coward · · Score: 0

      if the computer calculates odds the way the teams are ranked, then this may not always be so reliable.

      Relative to what?

      Saying something isn't reliable is not very insightful unless you have some valid comparison to use. I could say anything is not reliable, but justifying that claim is a lot harder. For example, I could say hurricane prediction is not reliable. But once I started to make comparisons I would realize that it is much more reliable than it was 20 years ago.

      The irony of your critique is that the computer picked the most reliable conventional rating--team rankings--and it got them right! I'm not a betting man, but if I were I would rather put my bet on a #1 ranked team to get into the final four than for a #2 or lower. I wouldn't expect all of my bets to be correct, but that doesn't mean I would start moving my bets to lower ranked positions--that would be stupid, statistically.
    3. Re:For the first time in a while... by Anonymous Coward · · Score: 0

      the bot picks its own rankings based on game statistics. it doesn't pick teams based on ranks. nice to see tl;dr gets modded insightful on /.

    4. Re:For the first time in a while... by langelgjm · · Score: 1

      From TFA:

      By identifying the 30 of the 36 Final Four participants in the past nine years of tournaments, the method has achieved 83 percent accuracy. In comparison, the seedings and polls have correctly identified only 23 of the 36 NCAA Final Four participants in the same nine year stretch, and the currently used Ratings Percentage Index (RPI) formula identified 21.
      --
      "Anyone who [rips a CD] is probably engaging in copyright infringement." - David O. Carson
    5. Re:For the first time in a while... by insertwackynamehere · · Score: 1

      So in other words you're saying you think the computer just sorted by seed? That's what I did because I didn't follow the season. From my perspective picking upsets is dumb since I ONLY know the seed (I don't follow teams enough or sports enough to be able to predict upsets with any ability). My point simply was, IF the computer is going to consistently agree with the seeds for the Final Four, then this happens to be the one year where it worked. I have nothing to compare it to, and I am not saying this is the case. I was simply pointing out that the 4 number 1s have never been in the final four before. Maybe the computer figured out that those teams were, specifically this year, final four material and the algorithm is legitimate. Or maybe the machine would generally pick the 1 seeds because of its algorithm and got lucky this year since it's the first time for it to ever happen. I'm not claiming to know, I'm just bringing it up because I feel it's worth mentioning. Maybe I'm completely wrong, but I am not trying to prove anything, just start discussion on its possibility.

    6. Re:For the first time in a while... by insertwackynamehere · · Score: 1

      Yeah but how are the ranks picked? I'm saying it came to the same conclusion which may not always be true. But like I said in an above post, maybe I'm wrong and the machine is more legit than that. I'm not claiming anything either way, I was simply bringing up the fact that the 4 #1 seeds were all in the final four this year for the first time ever and how if the machine calculates rank, maybe it's similar to how the NCAA does it?

  2. Yeah well... by Corpuscavernosa · · Score: 1
    I picked all 4 #1 seeds as well. Sure it's never happened before but the odds have got to be better than trying to pick which numbered seed is going to get in from each division.

    The real test would be to look at the rest of the computer's bracket.

    --
    We figured out a long time ago that it's easier to elect seven judges than to elect 132 legislators.
    1. Re:Yeah well... by peragrin · · Score: 1

      exactly, the whole bracket should be looked at to see just how good the system is.

      As for the final four It wouldn't surprise me to see the next several years only #1 and #2 seed teams make up the final four, the rest of the field is a joke. There is no real competition for the top 1 and 2 seeded teams.

      --
      i thought once I was found, but it was only a dream.
    2. Re:Yeah well... by grogdamighty · · Score: 1

      No real competition? Sure, all of the #1s made it to the Final Four for the first time ever. But two #2s lost in the second round (Duke and Georgetown), one in the third (Tennessee), and one in the Elite 8. So at most, if you give a little leeway in the 2-3 game that's supposed to happen in the Sweet 16, half of the #2s performed to expectation. The other half didn't even make it to the second weekend.

      --
      My other sig is funny.
  3. Making sports bets by Z00L00K · · Score: 4, Interesting
    Is always a question of statistics with a random noise involved.

    The amount of noise involved strongly depends on which sport that is involved. Basket is a sport where a lot of points is scored, which in turn means that the noise is relatively low while football (what americans call soccer for some strange reason and what americans call football is more like rugby) has a lot of noise since the ability to score a goal there is depending a lot on luck.

    This essentially means that counting points is a good way to score a basketball team while counting goals won't give much clue to how good a given football team is. You must look at other factors on a football team instead. And not all those factors can be as easily measured. Of course - the other factors are also important for a basket team. Other factors involved are the composition of players, individual player mood/health/inspiration, latest matches, history between the teams, referee behavior, weather, spectators, location, timezone etc. Add to this the element of randomness caused by the impact of the ball on a surface, player positions at certain points of the game etc.

    --
    If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    1. Re:Making sports bets by Anonymous Coward · · Score: 0

      I'm sorry, but I have no idea what you're going on about. Yes, there's always a question of "random noise" but what does that have to do with bringing up football? And why bring up that Americans call it soccer and go on about American football and rugby? Why bring up football at all? It makes you look like such a prat. Oh no, they brought up something American, quick! I must counter it with football!

      Anyways, there are what, 58 games played by Man United over 9 months? You want to talk about noise? Every NBA team will play 82 games in 4 1/2 months. Each team will play every other team in its division four times (2 at home, 2 away) as well as playing every team in their conference at least 3 times. The fact that so many games are played over such a short period of time (40% more games in half the time of a soccer club) means that it's more of a slog than football. It's as much a mental grind as it is a physical grind. Hop a plane to Seattle, get to your hotel, practice, play the game, sleep, wake up the next morning, hop a plane to Sacramento, practice, play a game...lather rinse repeat. And of the things you've mentioned, weather is the only difference between basketball and football. Everything else is the same: fans, rivalries, surfaces, player position, etc.

      Going back to collegiate basketball, the reason why it's higher scoring is because there's a shot clock. You get 35 seconds per possession to score. Basketball is simply structured differently from football, so there are possessions and really a guaranteed amount of time to execute a play. And saying there's more "luck" involved in soccer is bollocks. Upsets happen in basketball as well. Siena (#13) laid the smackdown on Vanderbilt (#4). Or Villanova over Clemson. Or Davidson (#10) over Gonzaga (#7), Georgetown (#2) and Wisconsin (#3).

      And scoring alone is hardly a good metric to measure a collegiate (or professional) basketball team. Certain teams (Washington State and Wisconsin come to mind) are defense oriented teams. So, the number of possessions decreases in games with teams like that. So, you're more likely to see scores like 54-65 or 48-56. Does that mean because Wisconsin only scored 52 points and not 98 points it's a bad team? Wisconsin scored fewer points in wins over Cal State and Kansas State, 71 and 72 respectively, than Mount Saint Mary's scored in losing to North Carolina (72). Does that mean Wisconsin is a worse team than Mount Saint Mary's?

      It's pretty obvious you're not a fan of basketball, collegiate or professional, so why contribute something that's off-topic? It even states in the synopsis that strength of schedule is used as well as home court advantage. On any given day in any given sport an underdog can win. Otherwise, why would we play?

    2. Re:Making sports bets by Mr.+Underbridge · · Score: 4, Insightful

      Is always a question of statistics with a random noise involved.
      The amount of noise involved strongly depends on which sport that is involved. Basket is a sport where a lot of points is scored, which in turn means that the noise is relatively low while football (what americans call soccer for some strange reason and what americans call football is more like rugby) has a lot of noise since the ability to score a goal there is depending a lot on luck.

      American football, over the course of a full game, has coarse scoring jumps (7pts for a touchdown) but luck plays a surprisingly small role. This is why good teams have very high winning percentages and poor ones have such low winning percentages. Not sure how that dynamic works in futbol, but the luck factor isn't as large as you'd think.

      The reason the LRMC method is well-suited to NCAA basketball is that A) there are a lot of games, and B) the good conferences don't play the bad ones much. That means that a high-order Markov model is a good way to determine who would beat whom through a game of "I beat a team that beat a team that beat a team that beat you" sort of thing.
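      The "I beat a team that beat a team that beat you" idea can be sketched as a random walk over teams. This is only a minimal illustration and not the LRMC authors' method: the function name `markov_ratings`, the fixed `p_move` probability, and the sample games are all assumptions made up for the example, and it ignores home court and margin of victory, which the summary says LRMC uses.

```python
def markov_ratings(games, p_move=0.8, iterations=200):
    """Rate teams by the long-run distribution of a random walk.

    games: list of (winner, loser) pairs.
    A hypothetical voter sitting on a team picks one of that team's games
    at random, then moves to that game's winner with probability p_move
    and to its loser with probability 1 - p_move.
    """
    teams = sorted({team for game in games for team in game})
    played = {t: [g for g in games if t in g] for t in teams}
    rating = {t: 1.0 / len(teams) for t in teams}  # start uniform

    for _ in range(iterations):
        nxt = {t: 0.0 for t in teams}
        for t in teams:
            share = rating[t] / len(played[t])  # split evenly over t's games
            for winner, loser in played[t]:
                nxt[winner] += share * p_move
                nxt[loser] += share * (1.0 - p_move)
        rating = nxt
    return rating


# Made-up results: A never plays D, but the chain still links them.
sample_games = [("A", "B"), ("B", "C"), ("C", "D")]
for team, score in sorted(markov_ratings(sample_games).items()):
    print(team, round(score, 3))
```

      Teams that never met still get compared through the chain of common opponents, which is the property that matters when good conferences rarely play bad ones.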

      I came up with a version of this independently before I stumbled over these guys last year. It's pretty fun and works quite well. It's certainly much better than the polls, and in most cases last year my system was within a point or two of the Vegas spread. It's also pretty good at recognizing underdogs early - mine had Davidson and Drake before they were in the polls.

    3. Re:Making sports bets by drooling-dog · · Score: 2, Interesting

      Several years ago I was playing with some iterative and least squares approaches to predicting (American) football scores and rating teams. It worked pretty well, but one thing stood out: When you use only the scores from previous games and home/visiting status as inputs to the model, you hit a pretty hard floor of about 2 touchdowns (13 or 14 points) for your standard error. That error includes the "hidden variables" that you mention, as well as the fundamental randomness of the game.

      It also implies that any statistical predictions you do are going to be off by 7 or more points 62% of the time, 14 or more about 32% of the time, and 28 or more about 5% of the time. That's worth considering when betting against a spread...
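      Those percentages can be checked with a short calculation. This sketch assumes the prediction errors are normally distributed around zero with a standard error of 14 points; the normality assumption and the function name `prob_error_at_least` are mine, not something stated in the parent comment.

```python
import math


def prob_error_at_least(points, std_error):
    """P(|error| >= points) for a zero-mean normal error."""
    # Two-sided normal tail: erfc(z / sqrt(2)) with z = points / std_error.
    return math.erfc(points / (std_error * math.sqrt(2.0)))


for points in (7, 14, 28):
    pct = 100 * prob_error_at_least(points, 14.0)
    print(f"off by {points}+ points: {pct:.1f}% of the time")
```

      Under those assumptions the three tails come out at roughly 62%, 32% and 5%, matching the figures quoted above.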

    4. Re:Making sports bets by ZombieWomble · · Score: 1
      The reason the GP brings up noise in the context of the different games is an implicit assumption that the "noise" in results is such that it follows Poisson statistics - in general, if you assume you're sampling some "true" probability of scoring/winning, the odds of getting a result which significantly deviates from the true value is smaller if the number of samples is higher. Consider how the ratio of heads to tails in coin tosses converges towards the mean as you perform more tosses as an example. Thus the GP assumes that more points scored and more games played constitutes more samples, hence yielding a much more accurate measurement of the quality of the team than would be obtained from a game with relatively lower absolute scores and fewer matches like football.
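      A rough way to put numbers on that argument, assuming independent trials for the coin toss and a pure Poisson count for scoring (both idealisations, and the function names are made up for this sketch):

```python
import math


def proportion_std(p, n):
    """Standard deviation of the observed success fraction over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


def poisson_relative_noise(mean_count):
    """Standard deviation divided by mean for a Poisson count (variance equals mean)."""
    return 1.0 / math.sqrt(mean_count)


# Coin tosses: the spread of the heads fraction shrinks like 1/sqrt(n).
for n in (10, 100, 1000):
    print(n, "tosses:", round(proportion_std(0.5, n), 4))

# Hypothetical scoring rates: more scoring events means less relative noise.
for mean_count in (2, 30, 80):
    print(mean_count, "events:", round(poisson_relative_noise(mean_count), 3))
```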

      Of course, I wouldn't necessarily be convinced that the statistics are that much better behaved in basketball as opposed to other games, due to the variety of confounding factors you mentioned (which are fairly constant between most sports) which may not necessarily be represented in a simple scoreline.

    5. Re:Making sports bets by Dr.+Winston+O'Boogie · · Score: 1

      Football is as random as basketball (or any other sport). You can win by 6 points as easily as lose by 6 points *if* the two teams are equally matched (on that given day). When teams are not equally matched, then randomness does not play much of a role, which is why good teams over the course of a season have good records and bad teams have bad records *in any sport*.

    6. Re:Making sports bets by Mr.+Underbridge · · Score: 1

      Football is as random as basketball (or any other sport). You can win by 6 points as easily as lose by 6 points *if* the two teams are equally matched (on that given day).

      Statistically that statement is invalid. You can't determine randomness from a single trial, and the fact is that in football, the better team wins far more often than in sports such as baseball. Good teams in football can have a winning percentage of over 0.800, which is far better than the best baseball teams. So the outcomes of American football are much less random than baseball. Yes the outcome in football *can* swing wildly, but the fact is it tends not to as often as in baseball, when a last place team can often beat a first place team 5-1 or something. Looking at it historically, that doesn't happen *as often* in football.

      NBA Basketball is also fairly un-random. College is far more difficult to evaluate because of the extreme discrepancy between the major programs and the also-rans in Division 1. I have a fair amount of college basketball data on hand, perhaps I'll compare it to NFL games sometime.

    7. Re:Making sports bets by Anonymous Coward · · Score: 0

      Spoken like somebody who doesn't actually watch basketball. The scores in basketball are routinely far, far away from what's expected. The reason being that while there's a lot of scoring in basketball, there isn't enough for the law of large numbers to take hold in any serious way.

      It's common while watching a game to find yourself saying "Team A is playing much better than Team B" even when Team B is winning, simply because Team B has run well on low percentage chances.

      This is why the NBA does finals as a best-of-7 elimination competition, in order to increase the chances that the best team will actually win the final.

    8. Re:Making sports bets by Anonymous Coward · · Score: 0

      And of the things you've mentioned, weather is the only difference between basketball and football. Everything else is the same

      You seem to have missed that basketball is a non-contact sport...

  4. My NCAA predictor code had the same result! by doxology · · Score: 2, Funny

    Here's the code I used

    List<Team> pickFinalFour(Tournament tourney){
          List<Team> finalFour = new ArrayList<>();
          for (Division d : tourney){
                Team bestTeam = null;
                int minSeed = Integer.MAX_VALUE;
                for (Team t : d){
                      // a lower seed number means a better seed
                      if (t.getSeed() < minSeed){
                            minSeed = t.getSeed();
                            bestTeam = t;
                      }
                }
                finalFour.add(bestTeam);
          }
          return finalFour;
    }

    --
    sigfault. core dumped.
    1. Re:My NCAA predictor code had the same result! by Anonymous Coward · · Score: 3, Funny

      Really? Here's mine:

      pick :: Ord a => [[a]] -> [a]
      pick = map minimum

  5. Why not test it for the past 10 years? by Anonymous Coward · · Score: 1, Interesting

    That was my first thought as well. The four #1 seeds are theoretically the most likely to be in the final four, assuming they were seeded correctly, but of course unexpected things usually happen in sports so this is the first time that's occurred. But if I had to bet my life on picking the final four, I'd probably pick the four #1 seeds in any given year because even though the odds of that occurring are low, the odds of me choosing which #2 or #4 seeds displace a couple #1s as usually happens so that I pick the correct four teams are probably lower!

    I'd say if these guys think their computer system is so good at making bets, can't they plug in data for the past 10 years worth of NCAA tournaments and see how well it does there?

    Or better yet, don't write an academic paper on it, put up their own money, win millions over the next few years beating Vegas, then tell us about it in a press release from the Caribbean island they bought with their winnings! You have $50 million in winnings to back you up and I'm a lot more likely to believe you've made a major advance!

    1. Re:Why not test it for the past 10 years? by Anonymous Coward · · Score: 2, Insightful

      Why would 10 years be so much better than the 9 years they analyzed?

    2. Re:Why not test it for the past 10 years? by SerpentMage · · Score: 1

      I read their paper

      www2.isye.gatech.edu/people/faculty/Joel_Sokol/ncaa.pdf

      And what bothers me are things like the following:

      "For the LRMC model, we used all of the game data (home team, visiting team,
      margin of victory) from the beginning of the season until just before the start of the
      tournament; we obtained this data, as well as tournament results, on line from Yahoo!
      daily scoreboards [24]. We note that neutral-site non-tournament games were unknown
      in our data set; the team listed as "home" on the scoreboard was considered the home
      team in our data"

      What they did is dig through the past and see how it does in the future. I know from my own financial clients the problem is that you will always find winning systems. That's actually pretty easy. The problem is whether they can actually predict. And most of the time the answer is NO.

      And like the poster said, "if this works so well why are they not betting?" Betting and the stock market are two places where it makes more sense to use your "system" than to talk about your system. You can literally make millions using your system. Yet here they are writing a paper...

      --

      "You can't make a race horse of a pig"
      "No," said Samuel, "but you can make very fast pig"
  6. Best bet is not to bet... by Bazman · · Score: 4, Interesting

    One of our research assistants started doing something like this about ten years ago, fitting a statistical model to previous soccer match results and the home/away effect. He rounded some of us up to chip in a few pounds each week and off he went to the bookies to bet on the outcome of his model.

    Now, any statistical model (such as this LRMC thing, or the techniques m'colleague used) will only give estimates of the odds. It might say that the probability of team A winning is 0.6. Now, if the bookies are offering you a return of 0.7 then it's worth a bet. If the bookies rate it 50-50 then it's not worth a bet.

    The trouble is that any statistical model worth its salt is going to produce probabilities that add up to 1.0, whereas the bookies' odds can add up to 1.2 or so. That's how they play the game and make their profits.

    So after a season where we made a few pennies profit, and got some press interest (including a team from BBC Tomorrow's World filming us playing football), my friend realised the best thing to do was not to bet at all.

    And instead he went into the business of supplying odds to bookmakers. From where he now sits at the top of a rather large business empire!

    I might pop him an email to see what his current techniques are, but back in the day it was something similar to this LRMC thing.

    1. Re:Best bet is not to bet... by dorpus · · Score: 1

      Are you mixing up probability and odds?

      Probabilities of all possible outcomes will add up to 1, but odds are p/(1-p), where p = probability of a given event. Odds can vary between 0 and infinity.

      Logistic regression predicts the log-odds of a given event (which can be exponentiated to predict odds, or converted to a probability.)
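      For anyone who wants the conversions spelled out, here is a small sketch. The function names are just labels chosen for this example.

```python
import math


def prob_to_odds(p):
    """Odds in favour of an event with probability p (requires p < 1)."""
    return p / (1.0 - p)


def odds_to_prob(odds):
    """Inverse of prob_to_odds."""
    return odds / (1.0 + odds)


def log_odds_to_prob(z):
    """Logistic function: turns a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


p = 0.6
odds = prob_to_odds(p)       # odds of 1.5, i.e. 3 to 2 in favour
log_odds = math.log(odds)    # the quantity a logistic regression models
print(odds, odds_to_prob(odds), log_odds_to_prob(log_odds))
```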

    2. Re:Best bet is not to bet... by Fizzl · · Score: 1, Funny

      One of our research assistants started doing something like this about ten years ago

      Jesus H Christ it must suck to be a scientist. Imagine working 10 years at one place and still be a mere "assistant".
    3. Re:Best bet is not to bet... by azgard · · Score: 1

      Exactly! I came to the similar conclusion (but theoretically, without computing the probabilities), when a friend, also a student of mathematics, came to me with similar idea. We then checked the bookmakers' odds and they all have this property (inverses add up to more than 1). There is nothing more to add to your post really, except maybe that bookmakers can also add any amount of uncertainty (coming from the statistical model of the data) into their odds (by making it more than one by higher or lesser margin), so they are completely immune to any loss.

      It's sad that most people don't realize that bookmarking is like roulette - you will lose on average no matter how good (statistical) information about the winner you have.

    4. Re:Best bet is not to bet... by Squalish · · Score: 1

      Not at all, he's saying that a bookie always makes his cut. He's saying that if you bet based on probability (if team A is 10%, you bet 10% of your money on him, and 90% on the other guy) on both sports teams in a head-to-head contest with no chance of a tie, it will cost you about 20% of your bet. With such a heavy cut (common in risky black markets), even a highly effective predictive algorithm can lose money.
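      A quick sketch of that arithmetic, assuming the bookie inflates every outcome's probability by the same factor (the 1.2 from the parent comment) and pays decimal odds of 1 / (1.2 * p). That pricing rule and the function name are assumptions for illustration only.

```python
def proportional_bet_returns(probs, overround=1.2):
    """Payout per unit of total stake, for each possible winner.

    The bettor spreads a total stake of 1 across outcomes in proportion to
    their probabilities; the bookie is assumed to price outcome i at
    decimal odds 1 / (overround * p_i).
    """
    returns = []
    for p in probs:
        decimal_odds = 1.0 / (overround * p)
        returns.append(p * decimal_odds)  # stake p, paid p * odds if it wins
    return returns


# Team A at 10%, the other side at 90%, as in the example above.
print(proportional_bet_returns([0.1, 0.9]))
```

      Under that assumption you get back 1/1.2 of your stake (about 83%) whichever side wins, so the guaranteed loss is closer to 17% than a full 20%, but the point stands.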

      --
      People in Soviet Russia, however, appear to be afflicted with amusing juxtapositions of the aforementioned situation
    5. Re:Best bet is not to bet... by maxwell+demon · · Score: 1

      It's sad that most people don't realize that bookmarking is like roulette

      You mean because you never know whether the link will have expired when you next open the bookmark? :-)
      --
      The Tao of math: The numbers you can count are not the real numbers.
    6. Re:Best bet is not to bet... by Bazman · · Score: 1

      I'm mixing up the terminology perhaps, but only because people are used to getting 'odds' from a bookie expressed as 'X to Y'. And in stupid units too (mathematically speaking). "100-30"? "6-4 on"? Jeez they're not even in their lowest terms! No wonder mathematical numeracy is declining!

      All bookies odds can be converted to a probability between 0 and 1, and it makes it easier to see if the probabilities do add up to more than 1 (and also if 100-30 is better than 6-4).
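      A minimal sketch of that conversion, reading a price of "X-Y" as "win X for a stake of Y". The function names and the two-outcome market at the end are made up for illustration.

```python
def implied_probability(win, stake):
    """Implied probability of fractional odds quoted as 'win-stake', e.g. 100-30."""
    return stake / (win + stake)


def book_total(prices):
    """Sum of implied probabilities over every outcome in one market."""
    return sum(implied_probability(win, stake) for win, stake in prices)


print(implied_probability(100, 30))  # 100-30
print(implied_probability(6, 4))     # 6-4
print(implied_probability(4, 6))     # "6-4 on", which is 4-6 written the usual way

# Hypothetical two-outcome market; anything above 1.0 is the bookie's margin.
print(book_total([(4, 6), (6, 5)]))
```

      100-30 implies a probability of about 0.23 against 0.40 for 6-4, so it is the longer price and pays more per unit staked.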

      Of course some would argue (and this being slashdot, some will) that the real reason for the decline in numeracy is because we no longer have to work out weights in pounds and ounces, or distances in feet and miles, or money in pounds shillings and pence. Err yeah maybe I dunno. Discuss.

    7. Re:Best bet is not to bet... by ZombieWomble · · Score: 1

      It's sad that most people don't realize that bookmarking is like roulette - you will lose on average no matter how good (statistical) information about the winner you have.

      This isn't strictly true - by definition, if you have perfect statistical information, you win every bet and cannot possibly lose on average.

      Extending this down to worse statistics, to win in the long run all you need to do is have sufficiently better information than the bookie to ensure that you can overcome the extra padding they give to their chosen set of odds, which is not impossible in principle.

      Of course, doing such a thing in practice is an entirely different kettle of fish, which is why it's still much better to be the bookie, but that glimmer of hope is why people keep playing.

    8. Re:Best bet is not to bet... by dstates · · Score: 1

      Read the original Baum and Welch paper. Ever wonder why Baum, Gaines, Petrie and Simons, "Probabilistic models for stock market behavior. To appear." never appeared? Check out James Simons.

      --
      Statesman
    9. Re:Best bet is not to bet... by typicallyterrific · · Score: 1

      Of course some would argue (and this being slashdot, some will) that the real reason for the decline in numeracy is because we no longer have to work out weights in pounds and ounces, or distances in feet and miles, or money in pounds shillings and pence. Err yeah maybe I dunno. Discuss.


      I'll bite :P.
      Yes, 'cos lord knows that all those people in other countries stuck on metric for the past hundred years just cannot do math.
      Damn Anglos.
    10. Re:Best bet is not to bet... by slimjim8094 · · Score: 1

      my friend realised the best thing to do was not to bet at all.

      Have you been watching WarGames?
      --
      I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
    11. Re:Best bet is not to bet... by SerpentMage · · Score: 1

      Did you also read that the fund is down....

      "James Simons's $29 billion Renaissance Institutional Equities Fund has fallen 8.7 percent so far in August when his computer models used to buy and sell stocks were overwhelmed by securities' price swings. The two-year-old quantitative, or 'quant,' hedge fund now has declined 7.4 percent for the year. Simons said other hedge funds have been forced to sell positions, short-circuiting statistical models based on the relationships among securities."

      BTW I use Quant methods as well, and I am as of last Friday up 5.4% on the year! With some stock picks being up a whopping 26%! (Apple was one of my picks. I got in at 119)

      The point is that quant methods ALWAYS break down. In fact if anything has been proven by this stock market whipping is that quants are just as much "idiots" as regular folks. And that is actually a good thing.

      --

      "You can't make a race horse of a pig"
      "No," said Samuel, "but you can make very fast pig"
  7. only one question by petes_PoV · · Score: 0, Troll

    who's going to win the 'National today? If it can't tell me that, then no matter how technical-sounding its algorithm is, it's not a lot of use to me.

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
    1. Re:only one question by TheCreeep · · Score: 2, Funny

      That's ok, because I don't think that they created the algorithm with you in mind. You're just a negligible quantity.

    2. Re:only one question by Simian+Road · · Score: 1

      Simon.

    3. Re:only one question by Simian+Road · · Score: 1

      Oh well

    4. Re:only one question by hobbit · · Score: 1

      I predict it will be "Comply or Die".

      --
      "Wise men talk because they have something to say; fools, because they have to say something" - Plato
  8. Excellent... by dandenoth · · Score: 1

    So I need a copy, preferably of the source, and a bookie.

  9. how does it compare to a playstation? by IKILLEDTROTSKY · · Score: 1

    3 years ago a friend of mine ran the super bowl through a football game and ended up 2 points off, does anyone know what the accuracy of those games is compared to a "real" system like this?

  10. Wait a minute! by ydra2 · · Score: 2, Funny

    Are you telling me that somebody actually looked at win/loss records and margin of victory and strength of opponents to figure out which team might win? How can this be? Why did nobody ever figure out this simple algorithm before? [slaps forehead with hand] DOH!

    Oh wait, sorry it was patented years ago, and multiple times with minute variations such as going back to strength of opponents' opponents, and margin of victory of opponents against common opponents, and strength of opponents' opponents' opponents, and ....

    But if you add in what they ate for breakfast, then you might have a new patentable algorithm.

    1. Re:Wait a minute! by epine · · Score: 1

      This particular strain of nihilism gets on my nerves after a while. There is no clear inference from "it's been tried before". Not even if you multiply by a million times, or a million wannabe losers all with the same wannabe dream.

      When substantial progress is made on a long-standing problem, generally there are three situations: the new approach was never tried before (because all the losers were looking under the same wrong rock), or the approach requires deep theoretical insight and skill (which losers rarely possess), or it was tried a million times already, but even so, none of the losers managed to do it quite right, until now.

      All three categories are well represented. By entirely blotting out the "it's been tried before" category on general principle (if a sneer can be referred to as a principle), one wipes out a substantial chunk of the pie of new results worth knowing.

      I guess some of us assign a low weight to mistakenly discarding genius, and place a higher priority on correctly labeling losers, which is the only thing the "it's been tried before" inference is any good at.

      To sift out the rare occasion of genius you actually have to RTFA and evaluate on merit. You can't pass judgments based on background noise such as "it's been tried before" with a broad sweep of the hand toward the loser parade.

      Reminds me of a book reviewed in the paper I read at the coffee shop today:

      Kluge: The Haphazard Construction of the Human Mind

      I noted that the illustration of the human mind in that newspaper review left out the center for "having to pee" which I suspect is extensively shared with the center for heckling other primates on slim cause.

      I wonder if that book is any good.

    2. Re:Wait a minute! by Anonymous Coward · · Score: 0

      And your point is what exactly ?

  11. Now Hear This! by IHC+Navistar · · Score: 1

    I will now be taking bets on how long before mob goons put an axe through the computer, tie it to a chair, and throw it into a river.....

    --
    Knowing Google's lust for data collection, the Soviet Union is still alive and well inside the psyche of Sergey Brin....
    1. Re:Now Hear This! by maxwell+demon · · Score: 2, Funny

      "We want this machine off, and we want it off now!"

      But I can predict which team the machine will predict to win: Team #42

      --
      The Tao of math: The numbers you can count are not the real numbers.
  12. I'm not convinced by drsquare · · Score: 3, Insightful

    If I had a computer that could predict sports results, I wouldn't tell anyone about it. I'd take a briefcase full of cash down to the bookmakers.

    1. Re:I'm not convinced by AncientPC · · Score: 1

      Bookies (online ones at least) will blackball you if you win too much of the time.

  13. RTFA by sarahbau · · Score: 5, Informative

    I know this is Slashdot, but why can't people RTFA before commenting? They aren't using the seeds or rankings in the program - only game stats, home quart advantage, etc. They ran it on the last 9 years of data and it picked final four teams 30% more often than analysts. (30/36 vs 23/36).

    The linked article didn't mention it, but from the GA Tech web site, it said that it correctly identified several overrated teams that lost early on (like Georgetown), and underrated teams that went farther than expected (like WVU). The program picks Kansas to win this year.
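
    The "30% more often" figure is a relative improvement, not a gap in percentage points. A quick check using only the counts quoted above (the variable names are just illustrative):

```python
# Final Four picks over 9 seasons (4 slots per season = 36 slots),
# using the counts quoted in the comment above.
slots = 9 * 4
lrmc_correct = 30
analyst_correct = 23

lrmc_rate = lrmc_correct / slots
analyst_rate = analyst_correct / slots

# Relative gain: how many more correct picks per analyst pick.
relative_gain = lrmc_correct / analyst_correct - 1
# Absolute gap between the two hit rates.
point_gap = lrmc_rate - analyst_rate

print(f"LRMC: {lrmc_rate:.1%}, analysts: {analyst_rate:.1%}")
print(f"relative gain: {relative_gain:.1%}, gap: {point_gap * 100:.1f} points")
```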

    1. Re:RTFA by sarahbau · · Score: 1

      lol. home quart advantage. Sorry. I should have converted that to liters before posting.

    2. Re:RTFA by KevinIsOwn · · Score: 1

      The summary could at least link to the paper. But then again, if we can't expect people to RTFA, I highly doubt anyone is going to RTFP...

    3. Re:RTFA by Anonymous Coward · · Score: 0

      Well, as long as it picked Kansas. If it had picked UNC, then we would have known for sure it was borken.

    4. Re:RTFA by Anonymous Coward · · Score: 0

      "home quart advantage"

      must be a gallon of that :-)

      Home Court Advantage is ALWAYS an adjustable parameter whose only justification for existence is that twiddling it produces the 'right results'.

      See Wikipedia - Numerology for details.

  14. Data mining by 26199 · · Score: 2, Insightful

    Doesn't say whether the test was done on in-sample or out-of-sample data. That is, did they test using the same data that was used during development?

    If so, the results are worthless. You can make a "system" that says anything you want given enough tweaking. (This is often the problem with apparently successful computer trading models).
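
    A minimal sketch of that effect, under assumptions that are not from the article: every outcome is a pure coin flip, and the "system" is simply the best of 1,000 random pick lists, chosen after the fact. Names such as `best_in_sample` are hypothetical.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
GAMES = 36               # e.g. 9 seasons x 4 Final Four slots
CANDIDATES = 1000        # number of "tweaks" tried during development

def accuracy(picks, outcomes):
    """Fraction of picks that match the outcomes."""
    return sum(p == o for p, o in zip(picks, outcomes)) / len(outcomes)

def coin_flips(n):
    """n independent fair coin flips as booleans."""
    return [rng.random() < 0.5 for _ in range(n)]

in_sample = coin_flips(GAMES)
out_of_sample = coin_flips(GAMES)

# Each candidate "system" is a random pick list with no skill at all.
# Keep whichever one happens to fit the in-sample data best.
systems = [coin_flips(GAMES) for _ in range(CANDIDATES)]
best = max(systems, key=lambda s: accuracy(s, in_sample))

best_in_sample = accuracy(best, in_sample)
best_out_of_sample = accuracy(best, out_of_sample)
print(f"in-sample: {best_in_sample:.2f}, out-of-sample: {best_out_of_sample:.2f}")
```

    The selected pick list looks skilled in-sample only because it was chosen for fitting that data. On fresh coin flips its expected accuracy is 50%.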

    1. Re:Data mining by shis-ka-bob · · Score: 1

      Captain Obvious, is that you? Isn't this one of the first lessons in the second year class for undergrad statistics? I hope we can assume that PhD statisticians are not going to use in-sample data and call it a 'prediction'. That's like 'predicting' last week's score.

      --
      Think global, act loco
    2. Re:Data mining by epine · · Score: 1
      If the conceptual approach is sufficiently restrictive (extreme paucity of tunable parameters), it still amounts to something to successfully predict in-sample data.

      What I was more concerned about is whether the prediction task they've taken on has low intrinsic difficulty. The fact that others have done it badly doesn't prove much. Worse, those other predictions might have been made with a different immediate purpose, for which they were closer to optimal than as interpreted by this paper for the prediction this paper chose to take on.

      It's still worth publishing prospective results in the situation where the available sample data is insufficient to partition into train/test subsets.

      If the method scores zero next year, or substantially underperforms chance over the next few years, they'll end up looking fairly foolish.

      Roughly, for this kind of result I would say there is a 25% chance it proves worthless, a 25% chance it softens toward the mean, and a 50% chance it continues to perform as well as advertised.

      Of the 50% chance it holds up, there is about a 50% chance that the predictive power pertains to some common sense term that other people have handled improperly or neglected, and only a 50% chance that the LRMC framework was instrumental to its predictive success.

      About that comment by some guy that in the NFL 0.800 percentages are more common than in MLB. Doh!

      I googled "nfl season length" and the second link comes up with this paper:

      http://cnls.lanl.gov/~ebn/pubs/sports/html/

      "The length of the season is a significant factor in the variability in the winning fraction. In a scenario where the outcome of a game is completely random, the total number of wins performs a simple random walk, and the standard deviation $\sigma$ is inversely proportional to the square root of the number of games played. Generally, the shorter the season, the larger $\sigma$. Thus, the small number of games is partially responsible for the large variability observed in the NFL."

      This season, the Ottawa Senators had a 15-2 start (about the length of an NFL/CFL season), but barely squeaked into the playoffs after one of the great collapses in pro sports.

      That was an 0.880 winning percentage (ignoring any infamous loser points the NHL now awards) over 17 games to start the season. Clearly the Senators were a shoo-in to win the Super Bowl. Bonus: Ray Emery would fit right in among professional place kickers.
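
      The square-root scaling can be checked numerically. This is a sketch that assumes every game is a fair coin flip, so the standard deviation of the winning fraction over N games is 0.5/sqrt(N). The season lengths (16 for the NFL, 82 for the NHL, 162 for MLB) are regular-season game counts supplied here for illustration, not figures from the paper excerpt.

```python
import math

def sigma_win_fraction(n_games, p=0.5):
    """Std. dev. of the winning fraction over n_games independent games."""
    return math.sqrt(p * (1 - p) / n_games)

def prob_at_least(wins, n_games, p=0.5):
    """P(at least `wins` wins in n_games) under a binomial model."""
    return sum(math.comb(n_games, k) * p**k * (1 - p)**(n_games - k)
               for k in range(wins, n_games + 1))

# Shorter seasons give a wider spread of winning fractions.
for league, n in [("NFL", 16), ("NHL", 82), ("MLB", 162)]:
    print(f"{league}: N={n:3d}, sigma={sigma_win_fraction(n):.3f}")

# Chance that a coin-flip team goes 15-2 or better over 17 games.
p_start = prob_at_least(15, 17)
print(f"P(>=15 wins in 17) = {p_start:.5f}")
```

      With 16 games sigma is 0.125, so a 0.800 record sits about 2.4 standard deviations above 0.500 on chance alone. Over 162 games the same record would be far more extreme.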
  15. Great sample by Idiomatick · · Score: 4, Insightful

    Great sample... They should test the algorithm on maybe 80 historical seasons and maybe we will be able to see something.

    1. Re:Great sample by xLittleP · · Score: 1

      "Great sample... They should test the algorithm on maybe 80 historical seasons and maybe we will be able to see something."

      Well, the problem with that is the NCAA tournament hasn't always used seeds, and only opened up to 64 teams in 1985. So you might want to see results from the past 23 seasons. That's why I don't trust most of the stats analysts provide unless they preface it with, "since the tournament expanded to 64 teams..." For example, UCLA has eleven or twelve national championships but most of those were back when you only had to win two or three games to claim the title. Six games is a marathon.
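
      To put "six games is a marathon" in numbers, here is a sketch assuming a team wins each game independently with the same probability, which is a simplification since opponents get tougher each round. The value 0.8 is an arbitrary illustration, not a measured figure.

```python
def title_probability(p_win, rounds):
    """Chance of winning `rounds` straight games at per-game win probability p_win."""
    return p_win ** rounds

p = 0.8  # hypothetical per-game win probability for a strong team
for rounds in (2, 3, 6):
    print(f"{rounds} wins needed: {title_probability(p, rounds):.3f}")
```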
      --
      When is Slashdot going to add a -1 moderation option for people who actually RTFA?
  16. Welcome the Overlord by BountyX · · Score: 0, Redundant

    I, for one, welcome our gambling overlords.

    --
    Trying to install linux on my microwave, but keep getting a kernel panic...
  17. I'm using it by jedijacket · · Score: 2, Interesting

    I heard about this last year and used their picks for this year's bracket. I'm tied for first in my pool, and 93.5% nationally in espn's bracket game. Just for comparison of how good their choices are. They had 100% on the first round day one.

    1. Re:I'm using it by jedijacket · · Score: 1

      They picked yesterday's games right too. So that makes their picks better than *99.1%* of the others in ESPN's pool, since I played the GT researchers' picks and that is my percentile there.

  18. Link to the paper by yo · · Score: 2, Informative

    Here is the paper describing the method: http://www2.isye.gatech.edu/people/faculty/Joel_Sokol/ncaa.pdf

  19. Money whoring... by DrYak · · Score: 1

    I hope we can assume that PhD statisticians are not going to use in-sample data


    Depends on what that paper was for.
    If the paper they published is to test and prove methods to produce good quality predictions, they'll probably use out-of-sample data.

    If the paper was published so they can ask for grants, they'll probably use in-sample data and any other possible trick just to make their system look more efficient. Special bonus if they managed to cram in a few money-producing hooks like "could be used by DHS to predict potential terrorist threats", "can be applied by police to more accurately detect child pornographers", "has potential military applications" and whatever justification can have the "pirate" keyword attached.
    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
  20. yadda-yadda by Anonymous Coward · · Score: 0

    "The system was tested on the 2008 US NCAA basketball season and picked all four of the finalists."

    On run 901, on run 666, on 10% of runs? Poppycock approach if that is all they can say about it. Read "A SEAGUL Visits the Race Track" for an example of a proper discussion of a prediction system.

  21. ivory tower cluelessness by Anonymous Coward · · Score: 0

    "The system was tested on the 2008 US NCAA basketball season and picked all four of the finalists."

    HAHAHA. Gamblers are not interested in the outcome of 8 games, gamblers are interested in the outcomes of 32 (round one) + 16 (round two) games, a useful return base. Waste of time/money after those rounds for gamblers. Fans are interested in such outcomes, and mainly tell you how they got cheated.

  22. But in the end... by CompMD · · Score: 1

    ...the program will have a special function designed to find something nasty to say about Kansas and the computer will begin making sounds like Dick Vitale on amphetamines screaming about North Carolina, and just for good measure, Duke, even though they aren't playing.

    Only then can it be true to life.

  23. Just great. by Rob+T+Firefly · · Score: 1

    Who taught Biff Tannen how to program?

  24. A few points from a professional by Anonymous Coward · · Score: 0

    1) Beating out the analysts on their predictions is no impressive feat.

    2) The title claims it makes the best "sports bets", but the article only goes on to say it was good at predicting who would win the game. The "winner" of the game is not necessarily the team that covers the spread -- and very often in the tournament (read: multiple times, every year) the team with the worse seed is FAVORED over the team with the better seed.

    3) The professionals that consistently win in Vegas do so by using computers that crunch the numbers and give them a sharper # than what the oddsmakers have. But there is a lot more to it than that -- one of the major problems these professionals have is that it is impossible for them to get a significant amount of money down on these "loose" #'s before they are moved -- when you start doing this on a regular basis you find that getting the money down on the best # is harder than picking the winner.

    4) What most sports gamblers never realize is that it's all math. It has very little to do with who you think will win, and a lot more to do with what spread you are getting. You've got certain players playing at places like Bodog, Sportsbook.com and other "square" books that have low limits and deal square lines -- those players will NEVER win significant amounts of money long-term. If you are consistently taking +4 in a basketball game while a "professional" is getting +4.5, you might as well just find another hobby.

    Eye On Gambling -- www.eog.com

  25. New Tests Will Show More by Anonymous Coward · · Score: 0

    Seeing as the final four were all the number one seeds in the tournament, the fact that the computer predicted this is not even that interesting. If it were a great algorithm/computer, the proof would be in how well it both picked the highest ranked teams to win and found the upsets. The bottom line is that the computer hasn't been tested enough. They need to keep testing it, and they will probably find that it is not very good, just like all the other computer programs.