Mathematician Predicts Yankees To Dominate
anthemaniac writes "Computerized projections in sports are nothing new, but Bruce Bukiet of the New Jersey Institute of Technology has developed a model that seems to work pretty well. He projects how many games a Major League Baseball team will win by factoring in how each hitter ought to do against each pitcher in every game. His crystal ball says the Yankees will win 110 games this year, a pretty safe bet, many might agree. But he also projects all the divisional winners. He claims to be right more than wrong in five of the past six years."
It's a safe bet that the Yankees will do well; they always seem to spend almost twice as much as most other teams on talent, not to mention luring good players away from other teams to crush the competition. Having said that, they have always spent that kind of money, and they have not done exceptionally well of late. 110 wins is a lot, and not many teams have accomplished that. Safe bet? Hardly.
Rhymes that keep their secrets will unfold behind the clouds. There upon the rainbow is the answer to a neverending story
Whoopty fsck. So's RailGunner. Runs are fun to watch, but pitching is what wins. And the Yanks have? Anyone? Anyone at all? Yep. They got nothin' at pitcher.
If brevity is the soul of wit, then how does one explain Twitter?
Has he put up beaucoup bucks in Vegas on his numbers? If not, why not. If so, how much did he win, and where can I get his numbers this year?
TLF
I do not respond to cowards. Especially anonymous ones.
Isn't there some rule or law about 'fitting a curve' to past data? Yet the sports predictions, and many of the 'stock market systems', are all about finding some seemingly obvious pattern in past data. While you might come up with a 'back tested' model that matches really well, it doesn't mean squat for the future.
The best way to test any model is to start with the end points. How low does it score the New York Mets?
Demented But Determined.
Sendou Wave Kick!!
"Hello Mr. Bukiet"
"It's pronounced bouquet!"
The higher the technology, the sharper that two-edged sword.
Wait, you mean you can use past data to try to predict future events under certain assumptions, and sometimes it works? Someone should generalize this into some sort of academic discipline!
Bruce is actually a die hard Mets fan. I helped work on this project with him back in my undergrad days 15 years ago or so. I doubt any of my code is still being used though. :-)
My UID is the product of 2 primes.
It was called Strat-O-Matic Baseball, and many a night in the hills of Worcester I had to fall asleep to the constant clinkity-clink-clink-clinkle of a pair of dice in a stolen cafeteria coffee cup.
"Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
signed,
Red Sox fan
The article says he has made more correct than incorrect predictions in his several years of doing this.
Something tells me that when he predicts that the Yankees will win 110 games, for example, he is counting his prediction as fulfilled if the Yankees win AT LEAST 110 games.
Because it would be pretty remarkable if he had correctly predicted the EXACT number of games teams would win more often than not over the past several years.
And since no margin of error is provided, there's really no basis for saying whether his model is impressive or not. Probably not.
He claims to be right more than wrong in five of the past six years.
That's nothing: I've developed a new mathematical algorithm that correctly predicts the outcome of the past six years with 100% accuracy.
The Yankees have weak-ass pitching this year. No chance they win 110 games. More likely 90.
SpyDock: Scientific Python in a Docker container
Don't Yankees fans predict they will dominate every year? That being said, I never take predictions like this seriously, especially if it is another "Yankees will pwn" claim. Odd, however, that I didn't see anyone predict what the 2001 Seattle Mariners did (116 wins).
Oh, and yes, I am a mathematician (will obtain BA degree in math this June).
Calling atheism and agnosticism a religion is like calling bald a hair color.
In 2006, he predicted 102 Yankee wins. They won 97. Not too bad.
In 2005, he predicted 113 Yankee wins. They won 95. Way off.
In 2004, he predicted 117 Yankee wins. They won 101. Way off.
In 2003, he predicted 110 Yankee wins. They won 101. Not great.
In other words, take this forecast with a big boulder of salt.
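For anyone who wants the misses tallied up, here is a quick sketch. The four rows are copied from the list above as quoted and are not checked against any other source; the variable names are just illustrative.

```python
# (year, predicted wins, actual wins), copied from the comment above
forecasts = [
    (2006, 102, 97),
    (2005, 113, 95),
    (2004, 117, 101),
    (2003, 110, 101),
]

# Signed error: positive means the forecast was too high
errors = [predicted - actual for _, predicted, actual in forecasts]
mean_error = sum(errors) / len(errors)
mean_abs_error = sum(abs(e) for e in errors) / len(errors)

for (year, predicted, actual), err in zip(forecasts, errors):
    print(f"{year}: predicted {predicted}, actual {actual}, error {err:+d}")
print(f"mean error: {mean_error:+.1f} wins")
print(f"mean absolute error: {mean_abs_error:.1f} wins")
```

On these four seasons every miss is on the high side, so the signed and absolute averages come out the same.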
My userid is prime!
Bill James came up with simple quantifiable statistics that could very accurately predict the success rate for a baseball team back in the '70s. The Oakland A's had a lot of success using those methods to put teams on the field that would win between 95 and 100 games per year while spending as little as possible. It worked remarkably well and a book (Moneyball, by Michael Lewis) was written about it.
In short, this is old and well covered news, unless this guy has come up with a simulation that is significantly more accurate (doubtful).
Bruce is actually a die hard Mets fan. I helped work on this project with him...
does he account for beltran removing the bat from his shoulder or just watching strike 3? and if so, constant or variable?
The IRS is the one organization that you don't want to fuck with. Remember, these are the guys who took down Al Capone.
The Pirates - 2nd lowest payroll - will suck again. 14 losing seasons in a row. I give it a 99.9% certainty they make it 15. I'm not even an MIT grad!
easier than predicting the future.
He modeled his program on the past 5-6 years' data, that's why: "He claims to be right more than wrong in five of the past six years."
How does he factor rookies? Does he model injuries and use the data to rank teams susceptibility to lost talent?
Unless this program is 6 years old his model is only back-tested; not proven.
"Accountant predicts Yankees will dominate based on salary spending."
"Sports historian predicts Yankees will dominate based on past seasons."
"Incoherent drunk predicts Yankees will dominate based on voices in his head telling him so."
"Everyone who's even remotely familiar with MLB dies of a massive simultaneous aneurysm trying to comprehend why anyone predicting the Yankees will be one of the top teams in the league for any reason at all qualifies as "news" rather than statement of the obvious."
Seriously, I'm from Massachusetts and detest the Yankees, and I still have to acknowledge that even if the Yankees are "having a bad season", they're still one of the best teams in the league.
Try not to take me more seriously than I take myself.
I want to know how he calculated Daisuke Matsuzaka's numbers since he's never played ball in the states. Theoretically he should dominate the AL given his performance in Japan but those numbers don't mean much when considering the power hitters in the AL, much less MLB. Here's hoping Bukiet is wrong though. I'd love to see the Yankees tank and not make the play-offs but I'm a Red Sox fan and I always hope that happens.
So let me get this straight..
Climatologists use past data, computer models, and mathematical projections to support global warming and predict future results, and everyone calls it strong science based on facts. If the models are off, it's just a part of the scientific process, but the overall claim is still valid.
But if a statistician uses past data, computer models, and mathematical projections to predict baseball results, it's dismissed as some crack job's phony science. If the models are off, it's proof that he has no idea what he's doing and how these kinds of models don't work.
Am I missing something here?
FTA: "Were the model to be commercialized, it could be updated on a play-by-play basis, which fans could monitor to see how every play changes the outcome of a game. "I think some fans would think that's cool," Bukiet said."
How individual plays affect the outcome (or probable outcome) has been a well-worn subject of late in the blogs and discussion lists of baseball fans. And you don't need commercial products for answers. Retrosheet.org provides play-by-play data reaching back decades, from which I calculated how often given game-states have resulted in wins for the home team. Taking the difference between the win expectancies before and after an event tells you how important the event was. My Win Expectancy Finder lives here.
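A minimal sketch of the idea, not the actual Win Expectancy Finder: count how often the home team went on to win from each game state, then take the difference between two states. The state layout and the handful of observations below are made up for illustration and are NOT real Retrosheet data.

```python
from collections import defaultdict

def win_expectancy_table(observations):
    """Map each game state to the fraction of times the home team went on to win."""
    wins = defaultdict(int)
    seen = defaultdict(int)
    for state, home_won in observations:
        seen[state] += 1
        if home_won:
            wins[state] += 1
    return {state: wins[state] / seen[state] for state in seen}

def event_swing(table, before, after):
    """Change in home win expectancy caused by moving from one state to another."""
    return table[after] - table[before]

# Hypothetical state layout (an assumption): (inning, half, outs, runners, home_lead)
tie_bases_empty = (9, "bottom", 1, "---", 0)
tie_runner_on_third = (9, "bottom", 1, "--3", 0)

# Made-up toy observations: (state, did the home team eventually win?)
observations = [
    (tie_bases_empty, True), (tie_bases_empty, False),
    (tie_bases_empty, True), (tie_bases_empty, False),
    (tie_runner_on_third, True), (tie_runner_on_third, True),
    (tie_runner_on_third, True), (tie_runner_on_third, False),
]

table = win_expectancy_table(observations)
swing = event_swing(table, tie_bases_empty, tie_runner_on_third)
print(f"swing in home win expectancy: {swing:+.2f}")
```

With decades of real play-by-play data the table would have thousands of states, and rare states would need some smoothing that this sketch leaves out.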
I imagine this guy's using Markov chains, too.
Signed,
Yankees fan
PS Have fun blowing up more innocuous devices because you think they're bombs
The Doormat
If you're not outraged, then you're not paying attention.
Nobody could predict this one: http://www.planetworldcup.com/CUPS/1950/wc50index.html
and the "Macacos" still cry about this......
PEÑAROL: You will be eternal like time, and you will bloom every spring
Wow, I never expected somebody that I knew to get on Slashdot. Bruce Bukiet is my Calculus II professor at NJIT.
He mentioned this before a few times, including today after that article made it to the most popular spot on Yahoo! News. This is more of a hobby for him than an official project.
From what he has said in the past about the model, it tends to overestimate the Yankees, among other reasons, because they often buy good players at the end of their prime. Thus the players won't play as well as they had in the past. He hasn't used it to make any bets. For the model, coming within a game or two of the actual results is considered a good prediction.
As some people above said, the model isn't intended to be extremely accurate, and is frequently off by a significant amount. The interviews he does are more to get people interested in math, and to see how it has real use, rather than to try and show off. He used to go into more details in the past, but doesn't now because they tend to confuse the interviewer, and don't make it into the final article.
Some pages of his own about the project are: http://m.njit.edu/~bukiet/baseball/baseball.html
http://www.egrandslam.com/
$50? coward, I'll see you $100,000. No way in hell the Yanks win 100 games let alone exactly 110.
Well actually if you predicted the Yankees to win the series every year from 1903 to 2006 you'd only have a .257 success rate. On the other hand that's a plurality, and more than double the wins of the next best thing, the Cardinals.
Their 2007 Yankees projections:
PECOTA: 93
Diamond Mind: 96
This sounds like a good idea, but you are gonna go crazy just like Maximillian Cohen trying to predict life. You can't predict a player going on the injured list the way you can calculate RBIs. It is illogical to use something like this in a chaos-filled world. For all you know, the whole Yankees team could be thrown out for illegal sports betting. It is also wrong because you forgot about the Detroit Tigers.
a little like saying the Cubs won't win?
What?
Not a safe bet at all. Especially considering that the AL East is fairly strong this year. It seems April Fools' Day comes 4 days late for baseball fans... The prediction is a joke. While math can certainly be applied to predict things like this, it fails to take into account that the Yankees overspend on old players. A more accurate prediction would be that by the end of the season, the cumulative number of years that Yankees players are past their primes is about 110.
Injuries. Did he take these into account? A lot of good teams have had lousy seasons due to players being hurt for long periods of time. MAYBE if every member of every team was able to play a full schedule of 162 games...
Performances. Not every player plays consistently every day: some guys go on hot streaks and get moved up in the batting order. Some guys go cold and get bumped down, or even worse, sent to the minors. MAYBE if the 25-man rosters stayed constant for the entire season.
Luck. Three teams each score 750 runs over the course of a season. Each one also allows 750 runs. Bill James' Pythagorean expectation (http://en.wikipedia.org/wiki/Pythagorean_expectation) says that each team should play .500 ball: 81 wins and 81 losses. But one team could win a lot of close games and lose a couple dozen blowouts, and finish with 90+ wins. Another could lose a bunch of close games and win a couple dozen blowouts, ending up with only 70 wins.
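For reference, the Pythagorean expectation in this comment can be sketched in a few lines. This uses the classic exponent of 2; the function names and the 162-game default are my own choices for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Bill James' Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)

# A team that scores 750 and allows 750 is expected to finish at .500
print(expected_wins(750, 750))  # 81.0
```

The formula only sees season run totals, so it cannot tell a team that wins close games from one that wins blowouts, which is exactly the luck being described.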
If you had done this this millennium you'd have struck out a lot. Like all the time. Let's face it. The time of total dominance by one team is over. Wild card and luxury tax seem to be doing what they're supposed to. The last six world series were won by six different teams. Of course that won't get my team any closer to a championship, but all Cubs fans agree: If we don't manage this year MLB just has to give the trophy to us. After 100 years that is the least we deserve.
Also, the play-offs are a total crap-shoot. 8 teams make it every season. The Yankees are pretty much always one of the 8. That doesn't guarantee a championship. Hell, a crappy team with 83 wins can win it all. Why spend 183 million dollars on your roster?
Hank! White!
But setting the odds on sports matches isn't really about the probability of one team winning or losing. It's about balancing the way that people will bet. The odds are structured to minimize the risk and maximize the return of the bookmaker, based on bettor behavior.
"Moose" Morgan doesn't need to know or care whether the Yankees are likely to beat Orioles tomorrow, only what the balance will be between bets on the Yankees and Orioles. As an experienced bookmaker, Moose will naturally give favor to the Yankees from the outset. But if he's getting twice the number of bets on the Orioles to win than the Yanks, then he will shift his odds accordingly.
Because of this, the science (and math) in sports gambling comes down to finding the inefficiencies -- i.e. figuring out where the bettors have moved the gambling odds far enough beyond the real odds that it makes a bet attractive. Meaning that if you can come up with an algorithm that is reasonably accurate and says that the Yankees:Orioles ought to be 6:5 but the betting odds are 7:2, you stand to make a killing.
This is much easier to do with something like horse racing than with baseball. With horses, you have a relatively small number of people spread across 6 to 15 or so potential winners in every race. Inefficiencies abound, in the sense that favorites often win but pay at 2:1 when realistically they should be 4:1, eg. The secret to a happy day at the track is finding the horse that should be 4:1 or 5:1 but is running at 16:1.
Jimmy knows his business, and deserves credit for that. After all, why would you try to gamble and risk losing when you're smart enough to take a piece of every bet and always win?
or so my attorney said
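The inefficiency argument above boils down to expected value. Here is a rough sketch using the horse-racing numbers from the comment: a horse whose fair odds are 4:1, paying either 2:1 or 16:1. It assumes "N:1" means a profit of N units per unit staked and ignores the track's or bookmaker's cut; the function names are made up for illustration.

```python
def fair_win_probability(odds_against):
    """Win probability implied by fair odds of `odds_against`:1."""
    return 1.0 / (odds_against + 1.0)

def expected_profit(true_probability, payout_odds, stake=1.0):
    """Average profit per bet: win payout_odds * stake, or lose the stake."""
    win_part = true_probability * payout_odds * stake
    loss_part = (1.0 - true_probability) * stake
    return win_part - loss_part

p = fair_win_probability(4)  # a horse that "should be 4:1"

print(f"paying 2:1:  {expected_profit(p, 2):+.2f} per unit staked")
print(f"paying 16:1: {expected_profit(p, 16):+.2f} per unit staked")
print(f"paying 4:1:  {expected_profit(p, 4):+.2f} per unit staked")
```

The bet at the fair price breaks even on average, the underpriced favorite loses money, and the overlooked 16:1 shot is the one worth finding.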
Speaking of computerized projections, if you're at all interested in horse racing, check out http://www.desertsea.com/. Oh that and it takes some guts to predict a good season for the yankees. That's like going to a casino and rooting for the dealer.
AL East: New York Yankees
AL Central: Cleveland Indians
AL West: Los Angeles Angels
AL wildcard: either the Boston Red Sox, the Toronto Blue Jays or the Minnesota Twins
OK, so he managed to choose division winners and then say that the Wild Card would come from one of THREE other teams. I don't think there's much math or stats going on here. Shouldn't he be able to pick ONE team and say they're going to win the Wild Card? This sounds more like a baseball fan's prediction than a mathematical prediction.
Disagreeing with me does not mean you get to mod me troll.
... in the book Moneyball by Michael Lewis. He follows Billy Beane through a season with the Oakland A's, where they won their division even though they were outspent by nearly every other team. This prompted former Fed Chair Paul Volcker to comment that Beane had found a market inefficiency. He had used such an inefficiency, but it wasn't Beane who had found it.
To do this right, however, you have to do legwork, because according to the model described in Moneyball, On Base Percentage is really what you're after, not batting average, and from a pitching/fielding perspective you want to do something more nuanced. He broke the field out into zones and provided feedback based on that. My recollection is that he didn't go into too many details about that part.
The important part was to get a $/runs scored number.
OMG, here in Europe I always enter /. on RSS (so with no tags indicated): honestly, this morning when half asleep I understood someone had mathematically determined that the US is to dominate everyone forever...
Herve S.
The more you regulate a company, the worse its products become.
What is this baseball of which you speak?
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
...was the year he blew it.
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
George and the whole Yankees organization ruined baseball; there are so many teams now that have so little conceivable chance of winning the World Series that the sports-watching public just isn't interested anymore. It's time for a salary cap.
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
If I flip a coin 300 times, it *should* land on head 150 times and tails 150 times. Guess what. It doesn't.
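To put a number on that, here is a short sketch of the binomial arithmetic for a fair coin. The function names are illustrative, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def prob_exact_heads(flips, heads):
    """Probability of exactly `heads` heads in `flips` fair coin flips."""
    return comb(flips, heads) / 2 ** flips

def prob_heads_between(flips, low, high):
    """Probability that the number of heads falls in [low, high]."""
    return sum(prob_exact_heads(flips, k) for k in range(low, high + 1))

exact = prob_exact_heads(300, 150)
near = prob_heads_between(300, 140, 160)

print(f"exactly 150 heads: {exact:.4f}")
print(f"between 140 and 160 heads: {near:.4f}")
```

An exact 150/150 split turns up less than 5% of the time, even though landing somewhere close to 150 is the most likely outcome. The same goes for a win projection: the single most likely total is still unlikely to be hit on the nose.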
That assumes that over that time span the Red Sox spent at or near the level of the Yankees, which I don't believe is true. Before the mid-90's the Sawx were something of a penny-pinching operation. They didn't really start flexing their monetary muscle until the acquisitions of Pedro and Manny.
For that matter the Yankees' payroll only got X-box hueg six or seven years ago. It was "reasonable" when they were actually winning championships.
I was just here, where did I go?
Who the F*uck cares who wins the division! Give me the World Series Winner!!
The F*ucking AL East is a joke. Strong my ass. The division will come down to either Boston or N.Y. Oooo surprise. With the Yankees most likely winning. 110 games, they will not win.
The odds are against the Yankees winning the World Series because they don't have pitching!
Every F*ucking year some Ass-hole picks the Yankees to win and they keep failing - 7 years since the last win people.
When they finally rebuild their pitching staff and have a manager who can coach and not baby sit - then the Yankees will start winning again.
We got Wang and Mussina... Pettitte maybe. Igawa is an unknown quantity right now. Proctor blows hot and cold, Farnsworth is a flip of the coin. I have a soft spot for Myers cause he always seems to get Ortiz out (bwhahaha). We also have Rivera to close things out and he's the best in the business. I'm not even going to talk about Pavano. The rest, ehhh they're ok I guess. Sometimes. We have some excellent prospects in the minors so I'm HOPING pitching will improve in the coming years.
If we win 110 games I'll be pleasantly surprised.
This is total bullshit.
First off, no one has been able to predict baseball results with great accuracy, and it's not for lack of trying. There's a whole cottage industry built around baseball statistics, populated by fans and professional scouts alike, and there's been some major innovation. But there's so much chance involved, and so many factors that we just can't measure (injuries, weather, slumps, etc.), that I don't think it's even possible to generate reliable predictions. Being more right than wrong five years out of six isn't all that impressive; common sense can usually net you at least three division winners a year.
As other posters have mentioned, 110 wins would not be a safe prediction for any team in history. Even for a very good, well funded team expected to be in the running, such as the Yankees or Red Sox, a reasonable expectation is 95-100 wins per season. 110 happens, but it's rare. Especially in the somewhat competitive AL East, where the Sox and Yankees reside. Most of the big win totals come from teams that utterly dominate their divisions. (About half of MLB games are intra-division.)
Furthermore, Bukiet's "surprising findings" aren't breaking news at all. I only dabble in Sabermetrics, but even I know that batting order has been proven to not really matter all that much, and that the third slot is where your best power hitter goes. (That last bit is actually conventional baseball wisdom, and has been around forever.)
If you're interested in learning more about statistical analysis of baseball, ignore the publicity-seeking academics and look to the Society for American Baseball Research, or pick up anything written by Bill James. Michael Lewis's Moneyball is also a good place to start.
lemme see... 999X (dollars to buy the best free agents) over 2x (rest of major league baseball) = profit and championships!
plus fans who hound mere mortals out of the ballpark...
yeah, I think that might lead to better than statistical dead heats.
I hereby place my secret formula into the public domain under GPL 2. any time X > Y in any of your programs, be sure to credit me.
if this is supposed to be a new economy, how come they still want my old fashioned money?
Have you not been watching baseball under the last 2 CBA's? There is more parity now than ever. Sure there are generally about 2-3 also-rans but by and large any team with good scouting and intelligent management can compete year in and year out. Oakland, Minnesota, Florida, Arizona, Atlanta (with new management), Washington(formerly Montreal), Milwaukee, have all done an exceptional job at being competitive even with less money.
Revenue sharing and the soft-cap have done wonders for the competitive balance. Even teams like Tampa Bay and Kansas City look to have some potential these days.
Everyone has a hometown favorite. I grew up in NY and my family followed the Mets. Around here, people like the Devil Rays (huh?). Every team has its fans, usually close to home. However, there are Yankees fans *everywhere*.
If the Yankees make it to post-season games, you guarantee LOTS of eyes on the screen. Lots of merchandise sales and ad revenue.
The Yankees win because they're supposed to. Likewise I believe Boston won because it was time to cash in on the "curse."
Does the name Pavlov ring a bell?
And I suppose you probably had Duke picked to go to the Final Four this year, too, eh? ;-)
Well *this* mathematician predicts that the Red Sox will win the division this year. Pulling numbers out of my ass has been right more often than wrong, so my prediction meets the described standard for quality.
How do his results compare to the predictions of sports commentators, or anyone else with a lot of knowledge and experience who makes public predictions? While the idea is cool and probably has a lot of potential in the long run, "right more than wrong in 5 of 6 years" doesn't sound particularly impressive to me, or to my crazy sports fan coworker sitting next to me...
-snarkbot
If money really won the championships, the Red Sox (second-highest spenders in all of baseball, they only spend something like 5% less than the Yankees) would have won more championships than they have.
1. The Red Sox spent $120 million last year--the Angels, White Sox, and Mets were all over $100 million (and 5 other teams were over $90 million). The Yankees spent $195 million. That's a hair bigger than a 5% difference (as in, the Yankees salary was 60% higher than the Red Sox--or the Red Sox salary was 40% lower than the Yankees--or half the teams in the league made less money than the difference between the Yanks and the Sox).
The Red Sox spend a lot, but it's pretty much in line with other high-spending teams. The Yankees eclipse everyone, and have a _massive_ monetary advantage.
2. In 2003, the Yankees, Mets, Braves, Dodgers, and Rangers all spent more than the Red Sox. Prior to 2000, the Red Sox hadn't even been in the top 5 in spending in years (since 1992--and even then they were still behind the Blue Jays in addition to the Yankees in their division). Through parts of the 1990s, they were in the bottom 1/2 of the league in spending.
Basically, they didn't spend _tons_ of money when they were losing. Using them as a "money doesn't win" example doesn't make sense--since opening up the pocketbooks, they've been quite successful (4 playoff berths and 1 WS in 5 years as the #2-spending team). Heck, they got started losing in 1919 when Babe Ruth was sold as a salary and cost-cutting measure to lower payroll.
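The percentages in point 1 are easy to check. A small sketch, using only the two payroll figures quoted above ($195 million and $120 million); the variable names are my own.

```python
# Payrolls in millions of dollars, as quoted in point 1 above
yankees = 195
red_sox = 120

gap = yankees - red_sox
yankees_over_sox = gap / red_sox * 100   # how much higher the Yankees payroll is
sox_under_yankees = gap / yankees * 100  # how much lower the Red Sox payroll is

print(f"gap: ${gap} million")
print(f"Yankees payroll is {yankees_over_sox:.1f}% higher than the Red Sox")
print(f"Red Sox payroll is {sox_under_yankees:.1f}% lower than the Yankees")
```

So the "60% higher" and "40% lower" figures are rounded versions of roughly 62.5% and 38.5%. The two differ because each percentage is taken against a different base.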
rage, rage against the dying of the light
Is he following Pascal, or what?
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
George and the whole Yankees organization ruined baseball; there are so many teams now that have so little conceivable chance of winning the World Series that the sports-watching public just isn't interested anymore. It's time for a salary cap.
Actually, they serve as proof that you CAN'T buy the World Series. They keep losing to less well funded teams that enjoy playing the game more.