Domain: baseballprospectus.com
Stories and comments across the archive that link to baseballprospectus.com.
Comments · 20
-
Re:How about using VR for calling balls and strike
I'm in favor of this, even if it's not implemented in precisely that way. I wouldn't want augmented reality in the umpire's mask to be a distraction from making another call like a balk or catcher interference. In the test, it wasn't implemented that way. Someone else (in the test, former major leaguer Eric Byrnes) was responsible for operating the equipment and making the calls, but it actually went well. You can read about it at https://www.wired.com/2015/07/baseball-game-no-umpire/. I believe it would be an improvement because it would get rid of a lot of the biases in the calls umpires make, which are documented at http://www.fangraphs.com/community/the-2016-strike-zone-and-the-umpires-who-control-it/. That last link is a really interesting read and shows that there are a lot of biases in the calling of balls and strikes.
This also isn't something you'd want to implement right away. Pitchfx has its own errors, partly because of a technical limitation: the flight of the ball is extrapolated over the last few feet before it reaches the plate. While the overall errors over the course of a season are normally distributed about zero, there can be larger systematic biases over smaller scales, such as within a single game. These are discussed in quite a bit of detail at http://www.baseballprospectus.com/article.php?articleid=13109. I'm okay with a randomly distributed error of a half inch or even an inch, provided that there's no systematic bias. On the other hand, if the calibration is off and the horizontal error is two or three inches, that's pretty significant. If the goal of automating ball and strike calls is to get rid of bias, you don't want to implement a system that has some of the same systematic biases that human umpires have. The solution is to commit to the technology and put the resources of MLB toward fixing the calibration issues with Pitchfx.
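To put numbers on the difference between random error and systematic bias, here is a minimal Python sketch. It is only an illustration, not how Pitchfx works: the 280 pitches per game, the half-inch noise, and the 2.5-inch calibration offset are assumed values chosen to match the sizes mentioned above, and the function names are hypothetical.

```python
import random
import statistics

def simulate_game_errors(n_pitches, calibration_offset_in, noise_sd_in, rng):
    """Simulated horizontal measurement error (inches) for each pitch in one game."""
    return [calibration_offset_in + rng.gauss(0.0, noise_sd_in)
            for _ in range(n_pitches)]

def summarize(errors):
    """Return (mean, standard deviation) of the errors.

    The mean estimates the systematic bias; the standard deviation
    estimates the size of the random, pitch-to-pitch error.
    """
    return statistics.mean(errors), statistics.stdev(errors)

rng = random.Random(13109)  # fixed seed so the run is repeatable

# Assumed scenario 1: calibration is right, only half-inch random noise.
well_calibrated = simulate_game_errors(280, 0.0, 0.5, rng)
# Assumed scenario 2: same noise, but the calibration is off by 2.5 inches.
miscalibrated = simulate_game_errors(280, 2.5, 0.5, rng)

bias_good, spread_good = summarize(well_calibrated)
bias_bad, spread_bad = summarize(miscalibrated)

print(f"well calibrated: bias {bias_good:+.2f} in, spread {spread_good:.2f} in")
print(f"miscalibrated:   bias {bias_bad:+.2f} in, spread {spread_bad:.2f} in")
```

Both simulated games have about the same spread. Only the mean separates them, which is why a per-game calibration check matters more than the season-long average.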
I'm absolutely in favor of doing this. But even if the "baseball purists" immediately dropped their opposition, it would take a little while to properly implement this. I think there's more involved here than, say, implementing instant replay over the course of an offseason.
-
Re:It's the Cubs
Maybe it would be cheaper if they just obtained a copy of "The Cardinal Way". http://www.baseballprospectus....
-
it's about defensive analytics
One of the great pleasures of baseball is that it generates a vast amount of data for the analytically minded to use and abuse to their heart's content.
This purchase is presumably related to MLB's recent announcement of a new system that will constantly track and measure the movement of the ball and every player on the field. Supposedly this is going to generate several terabytes of information each game, and some team has decided to buy a Cray as a way of processing all that data. Whether that's a better idea than the proverbial Beowulf cluster I don't know, but that seems to be this team's thinking.
Most, maybe all, baseball teams have been doing some variant of advanced analytics for quite some time now. Most of this work is proprietary and secret, but there's been a lot of "open source" (or at least publicly available) work that's probably along the same lines. Sabermetricians (baseball stat people -- from "SABR", the Society for American Baseball Research) have gotten very good at measuring offense, and reasonably good at predicting hitters' future numbers. Nate Silver's PECOTA system is the most famous, but there are others that work about as well (ZiPS and Cairo being the ones I've spent time with, plus the "dumb as the monkey on Friends" system called Marcel). Pitching numbers are understood pretty well, at least as they relate to the Three True Outcomes, which are the results of a batter v. pitcher matchup that don't involve any defensive players (i.e., walks, strikeouts, and home runs).
The next great frontier of analytics is defense. There's been a lot of work in this field over the last decade, but the problem has always been in getting good data. If a ball is hit towards the shortstop and the shortstop doesn't get to it, why is that? Is it because the ball was hit too hard? Is it because the shortstop was badly positioned by his coaches? Is it because the shortstop isn't very good? Data that's not much more than "groundball to shortstop" can't really answer that question, but the new tracking system promises to answer that sort of question in full by precisely measuring reaction times, routes to the ball, and so forth. This in turn might lead to greater and greater changes in defensive positioning, different emphases in player acquisition, maybe even in-game changes based on small changes in wind patterns or whatever.
Some of what we're already learning about defense is very surprising. For example, there has been a lot of work done recently on catcher's ability to "frame" pitches, that is to make a borderline pitch look good. The most current results suggest that the pitch-framing difference between the best and worst catcher might be worth something on the order of 5 wins. That's roughly the difference between having a random scrub and an All-Star as your right fielder, and all from a catcher's ability (or inability) to fool the umpire. It's shocking.
As for what team this is, when the news first broke it was claimed that the purchasing team "would surprise most people". That rules out the teams that are well-known to be friendly to advanced analytics -- starting with the Red Sox, Yankees, Cubs, and A's. The best guess I've seen is that it's the Phillies -- they have tons of cash and seem to be very behind on analytics, and seem likely to just go out and buy a supercomputer rather than have the MIT grads in their analytics department jerry-rig a bunch of Debian boxes into something cooler and weirder.
-
Re:nice analysis, now try hitting one
Actually, that's not what he said:
So, what has this analysis taught me? For an ordinary pitch, the trajectory follows a smoothly curving line approximated by nearly constant acceleration. For a knuckleball, rather than a line, imagine that the trajectory is confined to lie inside a tube which itself follows a smooth curve. However, the ball is otherwise free to flutter and zig-zag within the confines of the tube. With that picture in mind, the analysis I have presented shows that the diameter of that tube is very small, on the order of a few tenths of an inch at most.
...The smoothness conclusion appears to contradict the popular belief that knuckleball trajectories are erratic and often experience abrupt changes of direction. Let me speculate that this belief is the result of the randomness of movement, both in magnitude and direction, giving rise to the perception of erratic behavior. We have all seen instances where the catcher and pitcher get their signals crossed, and the catcher has to lunge for the ball at the last moment. The catcher expects a certain movement, and the pitcher throws something with different movement. With the knuckleball, no one really knows what movement to expect, so it is not surprising that the catcher has some difficulty cleanly catching the ball and that the batter has even more difficulty hitting it.
-
Re:who?
A basement dweller hears Schilling's name and remembers that only 4.9% of the runs that Schilling allowed over his career were unearned, which is the lowest percentage for any pitcher with a long career. We, er, *they* know that this means that ERA undervalues Schilling, because preventing unearned runs is a skill -- you do it by striking out batters, not walking anyone, and getting batters to hit fly balls rather than ground balls.
Yeah, that's way more useful than being able to identify the classes of various starships.
Of course it's not at all useful, which is the great beauty of it. Baseball probably has more nerd fans than any other American sport, in large part because it is an incredible generator of numerical data. Just look at Schilling's page at Baseball Reference. Look at all of those beautiful numbers!
A baseball game is largely a series of discrete events with a relatively limited number of possible outcomes. A typical baseball game has maybe 250-300 pitches thrown in it, each one of which has a measurable outcome that can be described in a relatively small number of ways. Almost every plate appearance results in one of these 27 outcomes. We have box scores recording most or all of this information going back to 1918, and various more basic levels of data back to 1871.
You can probably see where this is leading. Lots of data compiled over many years plus a relatively simple array of outcomes means that baseball is extremely well suited to the statistically inclined. There is a vast well of data to mine, you can create mathematical models of the game relatively easily, and, most importantly, many of these models do a decent job of describing what actually occurs on the field. And so now we have the vast and ever-expanding field of Sabermetrics, full of wonderful things like minor league equivalencies and extensive studies of the effects of a catcher's ability to "frame" a pitch. It's even possible to be a knowledgeable baseball fan and *never* watch a game or listen to one on the radio, but instead just look at the numbers every day. Almost like in The Matrix.
Compare this to other team sports, like football (either kind) or basketball. In those sports, numerous people are involved in every play and there's no good way of recording this in a form you can put on a spreadsheet. You can watch a soccer game and see that the goal was scored because a player who never touched the ball made a great run that confused the defense. But it's very difficult to record this information in a form that is friendly to data analysis. Or think of American football -- how do you make a good statistical record of the play of an offensive lineman? It's very difficult. In baseball, the fundamental interaction is between the pitcher throwing the ball and the batter trying to hit it. It's almost binary, and everything else on the field is vastly less important. It's a game for numbers, it's a game for nerds.
-
Ryan Braun is disputing a similar result
Recently Ryan Braun (rookie of the year, Major League Baseball) has been disputing a positive drug test that appears to be the same one Floyd Landis disputes, namely an abnormally high testosterone/epitestosterone ratio. It appears that MLB's testing protocol involves doing a cheap first test that is prone to false positives, then a more costly and accurate second test if the first is positive. In Braun's case, what has gone horribly wrong is that the results of his first test (positive) were leaked BEFORE the second test was run. Now everyone has lawyered up and the assclowns who run MLB have some explaining to do. This is discussed at length with all available public info here:
Braun Banned for PEDs
What does this have to do with Floyd Landis? Just that epi/natural testosterone comparisons aren't cut and dried, and that the French do like to find winning non-French bikers to be dopers, and under the French Napoleonic code of justice you are guilty until proven innocent.
-
Re:The strike zone *is* subjective, though.
There's no way to trick the umpire into giving you a smaller or undefined strike zone.
This, sadly, is not true. A study at the Hardball Times showed that umpires vary by up to about 5% from the league average in terms of the number of strikes they call, which suggests that human umps aren't so good at calling the defined zone. Umpires have idiosyncratic strike zones, based on personal interpretation of the strike zone, and often on what they can see based on where they set up behind the catcher. Some call strike zones wildly different from the rule book one (most famously, Eric Gregg consistently calling strikes on pitches a foot off of the plate in game 7 of the 1997 NLCS). There have always been batters who tried to shrink the strike zone through an exaggerated crouch before the swing. Rickey Henderson might be the best known recent example of this. Catchers have always attempted to "frame" pitches to convince the ump that a borderline pitch was a strike. It's been the traditional belief that this is one of the most important skills a catcher can have, and an outstanding recent study demonstrated that catchers can indeed succeed in this. The study suggests that the best at framing pitches (Jose Molina) saves about 60 runs a year over the worst at it (Jorge Posada or Ryan Doumit). 60 runs in a season is worth about 6 wins, using the standard sabermetric translation of 10 runs of player value equalling 1 win. 6 wins is a massive swing, the equivalent of replacing a player of little better than AAA quality with an All Star, maybe the 2011 performance of Evan Longoria or Adrian Beltre. Even if you compare Molina not to a bad catcher but to an average one (and thus, theoretically, the correct or at least average strike zone), you find that he's still winning 3 or 4 extra games by framing pitches, the equivalent of upgrading an average player to Longoria or Beltre. The thing is, Jose Molina pulls it off entirely by tricking the umpire.
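The runs-to-wins arithmetic above is simple enough to write down. This is only a sketch of the rule of thumb as quoted (10 runs per win); the function name is hypothetical, and the 30 and 40 run figures are just the 3 and 4 win estimates run backwards through the same conversion, not numbers taken from the study.

```python
RUNS_PER_WIN = 10.0  # the rule-of-thumb conversion quoted above

def runs_to_wins(runs_saved, runs_per_win=RUNS_PER_WIN):
    """Convert runs of player value into wins using a flat runs-per-win rate."""
    return runs_saved / runs_per_win

# Best framer vs. worst framer: about 60 runs a year.
best_vs_worst_wins = runs_to_wins(60)

# Best framer vs. an average one: 3 or 4 wins implies roughly 30 to 40 runs.
best_vs_average_wins = [runs_to_wins(r) for r in (30, 40)]

print(best_vs_worst_wins)    # 6.0
print(best_vs_average_wins)  # [3.0, 4.0]
```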
Now, framing pitches and fooling umpires has a long history and is very much a part of the fabric of the game. You can argue that it rewards a player like Jose Molina who has a real and now measurable skill, and penalizes players like Doumit and Posada who are poor at this important aspect of their jobs. Thus there isn't any real moral imperative to get rid of human umpires, other than the worst ones. But you can't argue that human umps can't be tricked, and you can't argue that human umps successfully call the subjective strike zone.
-
Re:Slashdotters talk baseball
I'm surprised I don't see more on /.; baseball depends heavily on a very controlled environment (batter vs pitcher) and is accessible to extensive statistical analysis. For those interested, I recommend Baseball Prospectus, Baseball Think Factory, the Society for American Baseball Research (SABR), and the writings of Bill James, the great modern popularizer of the statistical analysis of baseball (I think of him as the Bruce Schneier of baseball -- very insightful, clear analysis).
/. has enough problems, it doesn't need to become Professor Frink's crew hanging out in the back of Moe's: http://en.wikipedia.org/wiki/MoneyBART
^ ...everyone cared so much about the Banksy couch gag that no one watched the episode :P
-
Slashdotters talk baseball
This is pretty funny. If we were talking about Halo, we wouldn't see so many naive claims and theories, and so many of them moderated up! Instead of replying to each one, let me clarify a few points:
A major league batter knows the base he'll likely reach as soon as he knows where the ball will land. Having seen many thousands of hits, he can make a pretty good judgement pretty quickly. I've merely watched the games, and I can tell you well before the ball lands. It's all done without any math or calculations, if you can believe it, just rules of thumb based on experience:
* Over the center-fielder's head is a triple
* Reaching the wall elsewhere: a double
* Doesn't get by the outfielders: a single.
There are variables from that 'baseline': The defense could make a play on another baserunner, giving the batter the chance to get another base. Fielding mistakes, and sometimes a hard hit, a very fast/slow runner, or a very good/bad arm can make a difference of a base, but it's rare.
For the other question, I really don't know for sure. Baserunners are regularly outside the baselines, but I've rarely seen a baserunner go that far out unless he was avoiding a tag, taking out a fielder in a double-play, or over-running first base. But they sometimes round bases pretty widely without being called out. The rules are more complicated than they appear and the umps have discretion. I don't know for sure, but I doubt they'd be called out unless they were avoiding a tag or interfering with a fielder. I wouldn't depend on an answer that didn't come from an umpire.
I'm just a long-time avid baseball fan. I'm surprised I don't see more on /.; baseball depends heavily on a very controlled environment (batter vs pitcher) and is accessible to extensive statistical analysis. For those interested, I recommend Baseball Prospectus, Baseball Think Factory, the Society for American Baseball Research (SABR), and the writings of Bill James, the great modern popularizer of the statistical analysis of baseball (I think of him as the Bruce Schneier of baseball -- very insightful, clear analysis). Now, back to your regularly scheduled News for Nerds ...
-
Re:I never understand these things...
That's why we have PECOTA and similar systems to predict future player performance. The guys over at Baseball Prospectus do this all the time. This is nothing new.
-
strike calling tech in baseball
Baseball tried something similar. They decided a few years ago that they'd use a computer system (Questec) to "grade" umpires' strike zone accuracy, and then tie the grading to personnel decisions.
The system works by lining up tracking devices/cameras around a predetermined zone. Big problem. The strike zone is defined "from the bottom of the batter's knees to the midpoint between his shoulders and belt as he stands in a habitual crouch." This varies from batter to batter, it varies by the batter's stance; it can't be predetermined. Even instantaneously, it's a judgement call when a 90+ mph pitch is passing by. Then there's the matter that the strike zone is meant to be called as the ball goes over the plate. The strike zone isn't a plane at the front of the plate like many casual fans think. It's a solid volume floating above the pentagonal home plate. When pitchers are throwing good curveballs and sliders, that's very tough to get right, even for a machine.
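To make the plane-versus-volume point concrete, here is a rough Python sketch. It is not how Questec works. It treats the ball as a point moving in a straight line over the plate, and the zone heights (18 and 42 inches) and the pitch values are made-up illustration numbers; the function names are hypothetical. The plate shape is the standard one: 17 inches wide at the front, 17 inches deep, tapering to a point over the back 8.5 inches.

```python
PLATE_HALF_WIDTH_IN = 8.5   # plate is 17 inches wide at the front
PLATE_DEPTH_IN = 17.0       # front edge to back point

def half_width_at(depth_in):
    """Half-width of the pentagonal plate at a given depth behind its front edge."""
    if 0.0 <= depth_in <= 8.5:
        return PLATE_HALF_WIDTH_IN          # rectangular front section
    if 8.5 < depth_in <= PLATE_DEPTH_IN:
        return PLATE_DEPTH_IN - depth_in    # tapering back section
    return 0.0                              # not over the plate

def front_plane_only(x0, z0, zone_bottom, zone_top):
    """The 'casual fan' check: only look at the front edge of the plate."""
    return abs(x0) <= PLATE_HALF_WIDTH_IN and zone_bottom <= z0 <= zone_top

def crosses_zone(x0, z0, dx_per_in, dz_per_in, zone_bottom, zone_top, steps=68):
    """True if a straight-line path touches the volume above the plate.

    x0, z0: horizontal offset and height (inches) at the front edge.
    dx_per_in, dz_per_in: sideways and vertical movement per inch of depth.
    """
    for i in range(steps + 1):
        depth = PLATE_DEPTH_IN * i / steps
        x = x0 + dx_per_in * depth
        z = z0 + dz_per_in * depth
        if abs(x) <= half_width_at(depth) and zone_bottom <= z <= zone_top:
            return True
    return False

# Hypothetical curveball: an inch too high at the front edge, still dropping.
high_at_front = front_plane_only(0.0, 43.0, 18.0, 42.0)
drops_in = crosses_zone(0.0, 43.0, 0.0, -0.15, 18.0, 42.0)
print(high_at_front, drops_in)  # False True
```

The same pitch is a ball if you only check the front plane and a strike if you check the whole volume, which is the difficulty described above.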
When the system first came out, it was only in a handful of parks (7? out of 30). Umpires immediately tried to adapt to the system, trying to predict what their zone needed to be to agree with often-flawed calibrations. Games in those parks were way out of the norm for a while. Players threw tantrums (and Curt Schilling actually broke a machine) protesting the system. Now the system is in many more parks (~23) and the system is no longer in the spotlight. I believe the umps actually negotiated on what the system could and couldn't be used for (i.e., personnel decisions) in their last labor agreement.
There's an editorial from the original roll-out at http://www.wired.com/news/technology/0,1282,59284,00.html, and an inside view from an operator at http://www.baseballprospectus.com/article.php?articleid=3326 (not sure if this is a premium article, if you can't get to it sorry)
-
Re:"fisherman"
randomly searched for "baseball" on both... Microsoft's #1 results was the mlb.com website (which is what i would expect)... Google's #1 result was baseball-almanac.com
doesn't really mean anything, just thought that was interesting...
I suspect it means that Google's algorithm is better, still. The baseball almanac site has a tremendous quantity of baseball trivia, statistics, and anecdotes. Records on the site date back to the nineteenth century.
The Major League Baseball site (Google's second result, by the way) is certainly an important result, too, but I wouldn't be surprised if the real baseball addicts found the almanac site more useful, and linked to it more often.
I note that Google (correctly) returns the Major League Baseball site as the first result for searches for 'major league baseball' or 'mlb'.
Incidentally, it seems that Microsoft is continuing to update their index and/or tweak their algorithm. The current top results are now to The Baseball Archive, Baseball Prospectus...and so forth. The Major League Baseball page is actually on the second page of hits, after a number of significantly less relevant results....
Of note--if you click on a result, you get bounced through msn.com servers first. I presume that they're using that information to refine their search algorithm.
-
Re:Most Geek Sport - I think not
You might want to check out cricket, www.cricinfo.org and Wisden for some serious stats.
Perhaps it is you who needs to be enlightened. A brief look at the stats glossary at Baseball Prospectus might show you just how far out the geekier baseball fans are willing to go. Some other sites of interest include Baseball Reference, which contains complete statistics for every player ever to appear in a major league game, and Retrosheet, an organization attempting to gather historical play-by-play information on every game in MLB history. The detail put into these things is frightening.
-
Baseball Prospectus
For the die-hard baseball fans in Slashdot land: you should really be over at Baseball Prospectus.
Many of the new articles are for paying subscribers, but they do have one free article a day, and all free articles are archived:
Such as the Difficult Job Humans have judging the strike zone.
Hopefully this doesn't Slashdot the site, since I still have a few new articles to read tonight.
-
Frank White's REAL responses
Well usually Roblimo takes the highest-moderated comments and sends them to me, but I guess this will do.
1. In 1983, the GNU Project was first announced. One of its main goals was ownership of my anus. Being a man of the heterosexual persuasion, I was forced to go underground to avoid falling prey to Richard Stallman's anal invasion.
2. You probably read that on the Baseball Prospectus. Much like Slashdot, they are devoted to promoting the homosexual agenda through lies.
3. Once David Glass sells the Royals and the whole front office gets fired for gross incompetence, I'll have all the time in the world.
4. To cite either of those texts would violate copyright laws such as the DMCA and SSSCA which I and all law-abiding Americans hold very dearly.
5. Trick question, you aren't wearing pants. But your dick doesn't look big at all.
6. With a little investigation, anyone can realize that as long as Enron held to the good solid Christian values of Texas, they had a strong business. However, in the midst of the late 90's boom, they fell prey to the dark machinations of VA (Vaginas Away!) Software, Slashdot.org, and Apache. It was only a matter of time.
VA Software executives have been photographed frantically applying for MCSE classes, but most certification schools do not accept used condoms and old pizza boxes as payment. The writing is on the wall.
7. If by "that" you mean CowboyNeal's shriveled cock, forget it. I have enough karma already.
8. Everything was going well in China until Linux came along. Will they ever turn around and behold the warm sun of capitalism?
9. You're lucky I only have one question left.
10. In today's Washington Post Classifieds, I noticed an ad from a gentleman in Michigan who seeks some sort of Sodomite rendezvous. Clearly this man has visited the "Geek Compound" and tasted its dark delights. One can only assume this man supports the Kyoto Protocol and other anticapitalist government programs, and has not read The Kyoto Killers.
Now, I'm off to put on my full-body armor and watch Chuck Knoblauch practice his throwing.
-
Re:My asshole burns with the fire of Hades
I do believe that Willie Wilson now operates the King George Inn in Warren NJ. I currently serve as special assistant to general manager Allard Baird. Baird is, of course, a horrible GM, as Rany Jazayerli and Rob Neyer have written time and time again.
-
Ugh
You fucking asshole. You probably work for the Baseball Prospectus.