Slashdot Mirror


Mathematician Predicts Yankees To Dominate

anthemaniac writes "Computerized projections in sports are nothing new, but Bruce Bukiet of the New Jersey Institute of Technology has developed a model that seems to work pretty well. He projects how many games a Major League Baseball team will win by factoring in how each hitter ought to do against each pitcher in every game. His crystal ball says the Yankees will win 110 games this year, a pretty safe bet, many might agree. But he also projects all the divisional winners. He claims to be right more than wrong in five of the past six years."

28 of 170 comments

  1. 110 wins? by nebaz · · Score: 5, Insightful

    It's a safe bet that the Yankees will do well; they always seem to spend almost twice as much as most other teams on talent, not to mention luring good players away from other teams to crush the competition. Having said that, they have always spent such money and have not done exceptionally well of late. 110 wins is a lot, and not many teams have accomplished that. Safe bet? Hardly.

    --
    Rhymes that keep their secrets will unfold behind the clouds. There upon the rainbow is the answer to a neverending story
    1. Re:110 wins? by sebi · · Score: 2, Interesting

      I agree that RS vs RA is a good way to predict the success of a team. It's not always so helpful looking back. The Indians scored 870 runs last season and only allowed 782. How did they do? Not so well: a 78-84 record, good enough to finish fourth in their division. How can one explain that disparity? Blowouts. Those 22-0 games that happen every once in a while. I like Runs Scored vs Runs Allowed models. Just not the ones that get updated during the season.

    2. Re:110 wins? by Rogerborg · · Score: 2, Funny

      Not safe at all, until you factor in whomever the Mob has their money riding on.

      --
      If you were blocking sigs, you wouldn't have to read this.
  2. If he's so confident... by The+Living+Fractal · · Score: 2, Interesting

    Has he put up beaucoup bucks in Vegas on his numbers? If not, why not? If so, how much did he win, and where can I get his numbers this year?

    TLF

    --
    I do not respond to cowards. Especially anonymous ones.
  3. I never understand these things... by krbvroc1 · · Score: 4, Informative

    Isn't there some rule or law about 'fitting a curve' to past data? Yet the sports predictions, and many of the 'stock market systems', are all about finding some seemingly obvious pattern in past data. While you might come up with a 'back tested' model that matches really well, it doesn't mean squat for the future.

    1. Re:I never understand these things... by BridgeBum · · Score: 4, Informative

      His models have evolved over the years, but he tries to simulate actual games using both individual statistics (players' batting averages, etc.) and matchup trends (how well a player does against a specific pitcher). He uses a large Markov chain to predict state transitions (Runner on first, no outs - how often does it go to two outs? That sort of thing.) Very interesting project, it was a lot of fun to work on. (I was an undergrad working with Bruce 15 years ago, when he was first starting this project. He's kept it going for years.)
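
      A minimal sketch of the kind of base-out Markov chain described above, not Bukiet's actual code. The event probabilities, the runner-advancement rule, and the names `EVENT_PROBS`, `HIT_BASES`, `advance`, `transitions`, and `expected_runs` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical per-plate-appearance probabilities (illustrative only, not
# taken from Bukiet's model). They sum to 1.
EVENT_PROBS = {"out": 0.68, "walk": 0.08, "single": 0.16,
               "double": 0.05, "triple": 0.005, "homer": 0.025}
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases, event):
    """Return (new_bases, runs) after a non-out event.

    bases is a tuple of three 0/1 flags for first, second, and third.
    Assumption: a hit moves every runner as many bases as the batter,
    and a walk moves only the runners who are forced.
    """
    first, second, third = bases
    if event == "walk":
        if first and second and third:
            return (1, 1, 1), 1
        if first and second:
            return (1, 1, 1), 0
        if first:
            return (1, 1, third), 0
        return (1, second, third), 0
    n = HIT_BASES[event]
    # Batter starts at position 0, runners at 1..3; position 4 or more scores.
    moved = [n] + [i + 1 + n for i, occupied in enumerate(bases) if occupied]
    runs = sum(1 for p in moved if p >= 4)
    return tuple(1 if b in moved else 0 for b in (1, 2, 3)), runs


def transitions(outs, bases):
    """Map each next (outs, bases) state to its one-step probability."""
    result = {}
    for event, p in EVENT_PROBS.items():
        if event == "out":
            # Assumption: runners hold on an out; the third out clears the bases.
            nxt = (outs + 1, bases if outs < 2 else (0, 0, 0))
        else:
            nxt = (outs, advance(bases, event)[0])
        result[nxt] = result.get(nxt, 0.0) + p
    return result


def expected_runs(iterations=200):
    """Expected runs for the rest of the half-inning from each state."""
    states = [(o, b) for o in range(3) for b in product((0, 1), repeat=3)]
    value = {s: 0.0 for s in states}
    # Fixed-point iteration; it converges because an out always has
    # positive probability, so every half-inning eventually ends.
    for _ in range(iterations):
        updated = {}
        for outs, bases in states:
            total = 0.0
            for event, p in EVENT_PROBS.items():
                if event == "out":
                    if outs < 2:
                        total += p * value[(outs + 1, bases)]
                else:
                    new_bases, runs = advance(bases, event)
                    total += p * (runs + value[(outs, new_bases)])
            updated[(outs, bases)] = total
        value = updated
    return value
```

      Under these assumed numbers, `transitions(0, (1, 0, 0))` gives the one-step distribution from "runner on first, no outs", and `expected_runs()[(0, (1, 0, 0))]` gives the expected runs for the rest of the half-inning. A per-matchup model would replace the single `EVENT_PROBS` table with one table per batter-pitcher pair.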

      --
      My UID is the product of 2 primes.
    2. Re:I never understand these things... by Burdell · · Score: 4, Insightful

      It is still trying to predict future results based on past performance. No matter what you predict, last year's Chipper Jones will never again face last year's Roger Clemens. Even if Clemens un-retires (again), he is not the same person, and neither is Chipper Jones. You also can't predict injuries, trades, managers' decisions, umpires' calls, weather, etc., all of which have an impact on the outcome of an individual game.

    3. Re:I never understand these things... by Anonymous Coward · · Score: 2, Insightful

      You're right. We should stop trying to predict anything because we won't ever be 100% correct.

  4. Huh? by Kuukai · · Score: 4, Insightful

    "While Bukiet is the first to admit he's not a baseball expert, in five out of the past six years, he says that his model has produced more correct than incorrect predictions." What? Does this even mean anything? If, say, he was right 51% of the time in five of the years and wrong 90% of the time in the other year, wouldn't that make his number of successes less than the expected number of successes from just guessing "win" or "lose"? I guess he's either really modest ("I don't like to brag, so I'll just say the accuracy is higher than 42%."), or a really, really bad statistician.
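
    The thought experiment above can be worked through in a few lines. This is only a sketch of the arithmetic: the yearly figures are the hypothetical ones from the comment, and it assumes each year contains the same number of predictions.

```python
# Hypothetical yearly accuracies from the comment's thought experiment:
# right 51% of the time in five years, right 10% (wrong 90%) in the sixth.
yearly_accuracy = [0.51] * 5 + [0.10]

# Assumption: every year has the same number of predictions, so the
# overall accuracy is the plain average of the yearly figures.
overall = sum(yearly_accuracy) / len(yearly_accuracy)

# Number of years in which more predictions were right than wrong.
years_mostly_right = sum(1 for a in yearly_accuracy if a > 0.5)

print(f"overall accuracy: {overall:.1%}, years mostly right: {years_mostly_right}")
```

    So "right more than wrong in five of six years" is compatible with an overall accuracy of about 44%, which is below a coin flip.
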
    --
    Sendou Wave Kick!!
    1. Re:Huh? by AstrumPreliator · · Score: 2, Informative

      ...or a really, really bad statistician.

      Or a really good statistician. Remember, when you ask a statistician to crunch some numbers for you, he'll reply with "and what would you like the numbers to say?". He'll make them fit any curve you throw at him.

  5. Keeping up appearances by ScrewMaster · · Score: 4, Funny

    "Hello Mr. Bukiet"

    "It's pronounced bouquet!"

    --
    The higher the technology, the sharper that two-edged sword.
  6. amazing by flynt · · Score: 2, Insightful

    Wait, you mean you can use past data to try to predict future events under certain assumptions, and sometimes it works? Someone should generalize this into some sort of academic discipline!

    1. Re:amazing by ScrewMaster · · Score: 5, Funny

      They did. It's called "tenure".

      --
      The higher the technology, the sharper that two-edged sword.
  7. Re:Claims to be right more than wrong, heh? by BridgeBum · · Score: 3, Informative

    Bruce is actually a die-hard Mets fan. I helped work on this project with him back in my undergrad days 15 years ago or so. I doubt any of my code is still being used though. :-)

    --
    My UID is the product of 2 primes.
  8. But... Yankees Suck!! by Jon_S · · Score: 3, Funny

    signed,

    Red Sox fan

  9. That's nothing... by ericpi · · Score: 5, Funny

    He claims to be right more than wrong in five of the past six years.

    That's nothing: I've developed a new mathematical algorithm that correctly predicts the outcome of the past six years with 100% accuracy.

  10. He's been way off-the-mark for years... by Golgafrinchan · · Score: 4, Interesting

    First, a link to the professor's baseball page.

    In 2006, he predicted 102 Yankee wins. They won 97. Not too bad.

    In 2005, he predicted 113 Yankee wins. They won 95. Way off.

    In 2004, he predicted 117 Yankee wins. They won 101. Way off.

    In 2003, he predicted 110 Yankee wins. They won 101. Not great.

    In other words, take this forecast with a big boulder of salt.
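
    A short sketch that tallies the misses listed above. The figures are the ones quoted in this comment; the variable names are just for illustration.

```python
# (predicted, actual) Yankee wins by year, as listed in the comment above.
forecasts = {
    2006: (102, 97),
    2005: (113, 95),
    2004: (117, 101),
    2003: (110, 101),
}

# Positive error means the model predicted too many wins.
errors = {year: predicted - actual
          for year, (predicted, actual) in forecasts.items()}

mean_error = sum(errors.values()) / len(errors)
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)

print(errors)
print(f"mean error: {mean_error:+.1f} wins, mean absolute error: {mean_abs_error:.1f} wins")
```

    Every miss is in the same direction, so the mean error and the mean absolute error are both 12 wins: the forecasts for these four years were consistently too high rather than randomly scattered.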

    --
    My userid is prime!
    1. Re:He's been way off-the-mark for years... by njchick · · Score: 3, Funny

      I would say 1.0*10^2 wins.

  11. Big Whup... by Anonymous Coward · · Score: 2, Informative

    Bill James came up with simple quantifiable statistics that could very accurately predict the success rate for a baseball team back in the '70s. The Oakland A's had a lot of success using those methods to put teams on the field that would win between 95 and 100 games per year while spending as little as possible. It worked remarkably well and a book (Moneyball, by Michael Lewis) was written about it.

    In short, this is old and well covered news, unless this guy has come up with a simulation that is significantly more accurate (doubtful).

  12. A Much Safer Bet... by Black-Man · · Score: 2, Funny

    The Pirates - 2nd lowest payroll - will suck again. 14 losing seasons in a row. I give it a 99.9% certainty they make it 15. I'm not even an MIT grad!

  13. Climate Models? by Matteo522 · · Score: 5, Insightful

    So let me get this straight..

    Climatologists use past data, computer models, and mathematical projections to support global warming and predict future results, and everyone calls it strong science based on facts. If the models are off, it's just a part of the scientific process, but the overall claim is still valid.

    But if a statistician uses past data, computer models, and mathematical projections to predict baseball results, it's dismissed as some crack job's phony science. If the models are off, it's proof that he has no idea what he's doing and how these kinds of models don't work.

    Am I missing something here?

    1. Re:Climate Models? by zippthorne · · Score: 2, Insightful

      Yes. In the public experience, most fancy sports predictions have a history of being inaccurate. This is unlike the experience with climate models, which historically have also given us some predictions.

      --
      Can you be Even More Awesome?!
    2. Re:Climate Models? by Ibag · · Score: 2, Insightful

      What you are missing is that not all models are created equal, and not all things are equally easy to model. It's all about variance. Consider the weather, for example. We can accurately predict what it will be for a day or two, and we have a decent guess for about a week, but beyond that, there is too much complexity and variability for us to say much (not to mention that weather appears to be a chaotic dynamical system, which means that detailed long-range prediction is practically impossible). However, if I were to ask you what kind of weather I could expect this July, you could make a fairly accurate guess of "warm". All the small-scale variations cancel out, and you can have a very good prediction of what the average temperature, or average rainfall, or average anything else will be over the next year, or 10.

      For long term climate, we have a good idea how many of the processes involved work, and we can vary all the parameters to give ranges on the possible outcomes. While we can't use them to predict the rainfall in Boston on July 4, 2057, we can use them to say that the mean global temperature will be 3-5 degrees warmer that year (or some other similar statement).

      Compare this to baseball. There aren't enough interactions for small variations not to throw everything off. Things like injuries, marital problems, drugs, rivalries, and weather could shift the outcomes of major games and change the outcome in this model more severely than China switching to nuclear power would in climate models. There is a better chance at predicting total numbers of runs or hits during the season, as the variation on things like that is smaller. Predicting the number of games won is almost as hopeless as predicting the outcome of an individual game, and if you could do that, you could hire people to post to Slashdot for you.
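
      One rough way to put a number on that game-to-game variation is sketched below. It assumes, unrealistically, that every game is an independent coin flip with the same win probability, so the season win total is binomial. The function name `win_total_spread` is made up for this example.

```python
import math


def win_total_spread(win_prob, games=162):
    """Mean and standard deviation of season wins.

    Assumption: games are independent and share one win probability,
    so the win total follows a binomial distribution.
    """
    mean = games * win_prob
    std = math.sqrt(games * win_prob * (1 - win_prob))
    return mean, std


mean, std = win_total_spread(0.5)
print(f"expected wins: {mean:.0f}, standard deviation: {std:.1f}")
```

      Under that assumption a true .500 team has a standard deviation of about 6.4 wins over 162 games from chance alone, before injuries, trades, or anything else is considered.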

  14. Baseball is easy to predict by obdulio1950 · · Score: 2, Informative

    Nobody could predict this one: http://www.planetworldcup.com/CUPS/1950/wc50index.html and the "Macacos" still cry about this......

    --
    PEÑAROL: You will be eternal like time, and you will bloom every spring
  15. From one of his students by kenb215 · · Score: 5, Informative

    Wow, I never expected somebody that I knew to get on Slashdot. Bruce Bukiet is my Calculus II professor at NJIT.

    He mentioned this before a few times, including today after that article made it to the most popular spot on Yahoo! News. This is more of a hobby for him than an official project.

    From what he has said in the past about the model, it tends to overestimate the Yankees, among other reasons, because they often buy good players at the end of their prime. Thus the players won't play as well as they had in the past. He hasn't used it to make any bets. For the model, coming within a game or two of the actual results is considered a good prediction.

    As some people above said, the model isn't intended to be extremely accurate, and is frequently off by a significant amount. The interviews he does are more to get people interested in math, and to see how it has real use, rather than to try and show off. He used to go into more details in the past, but doesn't now because they tend to confuse the interviewer, and don't make it into the final article.

    Some pages of his own about the project are:
    http://m.njit.edu/~bukiet/baseball/baseball.html
    http://www.egrandslam.com/
  16. He left out several important variables by PFritz21 · · Score: 2, Interesting

    Injuries. Did he take these into account? A lot of good teams have had lousy seasons due to players being hurt for long periods of time. MAYBE if every member of every team was able to play a full schedule of 162 games...

    Performances. It would help if every player played consistently every day, but some guys go on hot streaks and get moved up in the batting order. Some guys go cold and get bumped down or, even worse, sent to the minors. MAYBE if the 25-man rosters stayed constant for the entire season.

    Luck. Three teams each score 750 runs over the course of a season. Each one also allows 750 runs. Bill James' Pythagorean expectation (http://en.wikipedia.org/wiki/Pythagorean_expectation) says that each team should play .500 ball: 81 wins and 81 losses. But one team could win a lot of close games, lose a couple dozen blowouts, and finish with 90+ wins. Another could lose a bunch of close games and win a couple dozen blowouts, ending up with only 70 wins.
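
    For reference, a minimal sketch of the Pythagorean expectation in its basic form, with an exponent of 2. The function name `pythagorean_wins` is made up for this example.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2):
    """Expected wins from Bill James' Pythagorean expectation.

    Win fraction = RS^k / (RS^k + RA^k), with k = 2 in the basic form.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# The even-runs case from the comment above.
print(pythagorean_wins(750, 750))

# The 2006 Indians figures quoted earlier in the thread.
print(round(pythagorean_wins(870, 782), 1))
```

    The 750/750 case comes out at exactly 81 wins. The Indians figures quoted earlier in the thread (870 scored, 782 allowed) come out at roughly 90 expected wins against an actual 78, which is the kind of gap that luck in close games and blowouts can produce.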

  17. Re:Bah by koreaman · · Score: 2, Insightful

    Generally one needs a Ph.D in math to be a "mathematician".

  18. Re:Red Sox suck!! by zero1101 · · Score: 2, Insightful

    "I got news for you both. The Yankees AND the Red Sox suck. Put 'em both in the AL Central, and they're fighting for third place tops." On what planet? Granted, the Red Sox did poorly against the AL Central in 2006 (15-19), but the Yankees were 23-12 against the Central.

    For the last 3 years, the Yankees are 61-37 against the AL Central as a whole, and the Sox are 56-45. For those years, the standings of the top 4 teams from the East and Central are as follows:
    2006:
    NYY 97-65
    MIN 96-66
    DET 95-67
    CWS 90-72
    2005:
    CWS 99-63
    NYY 95-67
    BOS 95-67
    CLE 93-69
    2004:
    NYY 101-61
    BOS 98-64
    MIN 92-70
    CWS 83-79

    Only last year would even one of those two teams not have ended up in a MINIMUM of third place, and the Yankees would still have been firmly on top. And frankly, a lot of the stars had to align for the standings to end up so well in the Central's favor last year. If you base your argument SOLELY on the 2006 results, and completely ignore any other factors, you might be able to make half a case, but it would be a weak one.