If you like baseball pitching statistics, then you’ve loved the month of June. On the first of the month, Johan Santana pitched the first no-hitter in Mets history. Then a week later, the Seattle Mariners used six different pitchers to do the same thing, which tied the Major League Baseball record for most pitchers used in a no-hitter. And finally, five days after that, Giants pitcher Matt Cain threw the 22nd perfect game in Major League history.
It doesn’t take a Six Sigma Black Belt to realize it’s been a crazy month. But as a stat nerd, the question I have is, how crazy has June really been? What are the odds of throwing a perfect game and a no-hitter? (Don’t worry; it doesn’t take a Six Sigma Black Belt to figure that out, either.) But before we start, we have an important question to answer:
…
Comments
Examine your assumptions
When your statistics don't work out, it is often a good idea to examine your assumptions. So what assumptions did you make in order to do your analysis? I suspect that your findings regarding the on-base percentage for all batters against the pitchers who pitched perfect games suggest there are factors not included in your model.
1. Independence--You assumed that the likelihood of striking a batter out was independent of whether or not the pitcher struck out an earlier batter. I don't know how true this is, but I suspect that pitchers, like all of us, have good days and bad days. On a bad day a pitcher will probably strike out fewer batters than on a good day. The effect of this would be to flatten the curve, putting higher probabilities in the tails and less in the middle.
2. Psychology has no impact--Does a pitcher who enters the fifth inning with a perfect game going pitch differently than he would if he doesn't have that perfect game going? Does the coach make different calls? Do the opposing batters start responding differently? I suspect the answer is yes. Can we measure it? Probably not.
3. The occurrence of perfect games follows a Binomial distribution--Perfect games are rare events, and the occurrence of rare events is often more appropriately modeled by a Poisson distribution. The rarer the events, the more closely they fit the Poisson distribution rather than the Binomial. What if we took our area of opportunity to be all of the games played in a year? (That is not a constant, so we would have to calculate the number of perfect games per some large number of games.) You could do a u chart and first of all see whether the rate of occurrence is stable. (I would be interested in seeing if there were clusters of occurrence that might indicate a special cause during certain periods of time.)
4. Another way to analyze rare events might be to look at mean time between failures (or in this case--occurrences). How long has it been since the last perfect game? Is this gap unusually long or unusually short?
In any event, rare events are much more difficult to model than frequent events, for any number of reasons, not the least of which is that they are rare. It is very difficult to understand things that don't happen very often, and they often violate the assumptions made in performing ordinary statistics.
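To make point 3 concrete, here is a minimal Python sketch, not the article's own method. It compares Binomial and Poisson probabilities for a rare event and computes u chart limits with the usual 3-sigma formula. Every number in it is a hypothetical placeholder (the per-game probability, the game counts, and the yearly tallies are made up for illustration, not real baseball data), and the function names are my own.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k events in n independent trials, each with probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k events) when events occur at an average rate of lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def u_chart_limits(counts, sizes):
    """Center line and per-subgroup 3-sigma limits for a u chart.

    counts[i] is the number of events in subgroup i; sizes[i] is that
    subgroup's area of opportunity (here, thousands of games played).
    """
    u_bar = sum(counts) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * math.sqrt(u_bar / n)
        # A rate cannot be negative, so the lower limit is floored at zero.
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits


# Hypothetical inputs, for illustration only (NOT real baseball data).
n_games = 5000       # assumed number of opportunities in one period
p_perfect = 0.0001   # assumed probability of a perfect game per opportunity
lam = n_games * p_perfect

# For rare events the two distributions give nearly identical answers.
for k in range(4):
    print(k, binomial_pmf(k, n_games, p_perfect), poisson_pmf(k, lam))

# Hypothetical yearly tallies: perfect games and games played (thousands).
counts = [0, 1, 0, 3, 0]
sizes = [4.2, 4.9, 4.9, 4.9, 4.9]
u_bar, limits = u_chart_limits(counts, sizes)
for count, size, (lcl, ucl) in zip(counts, sizes, limits):
    rate = count / size
    print(f"rate={rate:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}  signal={rate > ucl}")
```

A year whose rate lands above its upper limit would be a candidate for the kind of special cause mentioned above. With counts this small the normal-approximation limits are crude, so a signal is only a prompt to look closer.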
Perfect Game
This is an excellent article on one of several aspects of the game of baseball that every baseball fan has wondered about! You have shown what a remarkable feat and huge accomplishment a perfect game really is. It is the human factor that defies the odds, doubling what would be expected. Thank you for enlightening me!
What about the odds of hitting four home runs in one game? Why has no one hit five home runs in one game?