One Estimator to Rule Them All: xxFIP – Part 1: xK%
xFIP is already an awesome stat. It improves on FIP by assuming that a pitcher’s HR/FB rate is not a skill, so actual home runs are replaced by an expected total based on fly balls. xFIP is calculated as:
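For reference, the commonly published form of xFIP can be sketched as below. This is a minimal sketch, not code from this series: the function name is made up, and the default league HR/FB rate and FIP constant are placeholder values, since both change from season to season.

```python
def xfip(fly_balls, walks, hit_by_pitch, strikeouts, innings,
         lg_hr_per_fb=0.105, fip_constant=3.10):
    """Commonly published xFIP form.

    lg_hr_per_fb and fip_constant are placeholder assumptions;
    look up the real values for the season in question.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    # Expected home runs: fly balls times the league HR/FB rate.
    expected_hr = fly_balls * lg_hr_per_fb
    numerator = 13 * expected_hr + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + fip_constant


# Made-up season line, for illustration only.
example_xfip = xfip(fly_balls=200, walks=50, hit_by_pitch=5,
                    strikeouts=200, innings=200.0)
```

Note that K’s and BB’s both appear directly in the numerator, which is why framing leaks into xFIP.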
If you have read my article on my other ERA Estimator TIPS, you would have read about my goal to eliminate catcher framing and umpire bias from pitchers’ stats. This skill of framing should be attributed to the catcher, not the pitcher.
How can we take a good ERA Estimator like xFIP and eliminate framing and bias? We need inputs that are not affected by framing. K’s and BB’s are the only xFIP inputs affected by framing, and xFIP uses both. What needs to be done?
Expected K’s and BB’s are needed, and they need to be independent of catcher framing. K% and BB% are better targets for expected rates than K/IP and BB/IP, since they are per batter faced instead of per inning. There are already good xK% and xBB% formulae, but they use actual strikes and balls. Strikes and balls are affected by the catcher, so we need a new xK% and a new xBB%. xHR/FB will also be explored.
Today we will be determining the xK% portion of our xxFIP formula. But first, as a little teaser for the coming parts of this series, here are the top 10 pitchers by xxFIP in 2013:
Also, why do we care about xxFIP? What good does it do us? Well, in some preliminary tests, both out of sample (2012-2013) and in sample (2008-2012), xxFIP proved to be the most predictive estimator throughout the season. It is beaten only in the very short term, by the aforementioned TIPS. But these results will be saved for later in this series.
For xK% (and xBB%), I wanted it to change on every pitch. I also wanted it to not have a constant, and take into account every event, so that every pitch outcome (like every on-base event in wOBA) has a weight. I broke down pitches into 5 outcomes that I believed should have a role in xK%.
The first one is BIP (Balls In Play)/Pitch. A ball in play completely eliminates the chance of a strikeout, right? BIP/Pitch is also a skill, and it correlates from year to year. It is also unaffected by the catcher. Check! You can calculate BIP/P yourself by adding ground balls, line drives, and fly balls together and dividing by the total number of pitches.
The next possible outcome on a pitch is a swinging strike. This is commonly known as SwStr%. It is the percentage of pitches that are swung at and missed. This is obviously very important in striking out hitters, and it is a skill for the pitcher.
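The BIP/P calculation just described can be sketched in a few lines. The function name and the counts in the example are made up for illustration.

```python
def bip_per_pitch(ground_balls, line_drives, fly_balls, pitches):
    """Balls in play per pitch: (GB + LD + FB) / total pitches."""
    if pitches <= 0:
        raise ValueError("pitches must be positive")
    return (ground_balls + line_drives + fly_balls) / pitches


# Made-up counts for a full-season starter.
rate = bip_per_pitch(ground_balls=250, line_drives=110,
                     fly_balls=190, pitches=3000)
```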
The next outcome that could happen that relates is a foul ball. Foul/Pitch is a skill year to year and a foul is a strike! Fouls are not as helpful as swinging strikes however, as you can’t strike out on a foul ball. That is why these outcomes are separated.
The only other possibility is that a pitch is not swung at. Does it matter where that pitch is? Yes: if it’s in the zone it should be a strike (good), and if it’s outside the zone it should be a ball (bad). These will be called ZL (Zone-Looking)/Pitch and OL (O-Zone Looking)/Pitch.
Shouldn’t ZL/Pitch be worth the same as a swinging strike? You can strike out on both, right? Yes, a looking strike should have the same value as a swinging strike, but a pitch in the zone is not always called a strike. On average, a pitch in the zone is called a ball around 14% of the time. This means ZL/Pitch should be less valuable than SwStr%.
OL/Pitch is the only outcome left. You may think more balls means fewer strikeouts, but throwing more balls actually means more strikeouts. Why? Any pitch that is not put in play increases the chance of a strikeout, and a pitch watched out of the zone keeps the plate appearance alive unless there are already 3 balls. Throwing a ball is not nearly as harmful as having the ball hit in play. There is also the fact that 7% of pitches outside of the zone are called strikes.
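The five outcomes can be sketched as a small classifier. The field names (`swung`, `contact`, `in_play`, `in_zone`) are hypothetical and do not come from any particular pitch-tracking feed. The key point is that `in_zone` is the tracked pitch location, not the umpire’s call, so the catcher has no influence on any of the buckets.

```python
from collections import Counter


def classify_pitch(swung, contact, in_play, in_zone):
    """Assign a pitch to one of the five xK% outcomes.

    in_zone is the tracked location, never the called ball or strike.
    """
    if not swung:
        return "ZL" if in_zone else "OL"   # taken pitch: location decides
    if not contact:
        return "SwStr"                     # swing and miss
    return "BIP" if in_play else "Foul"    # contact: in play or foul


# Five made-up pitches, one per outcome.
pitches = [
    {"swung": False, "contact": False, "in_play": False, "in_zone": True},
    {"swung": False, "contact": False, "in_play": False, "in_zone": False},
    {"swung": True, "contact": False, "in_play": False, "in_zone": False},
    {"swung": True, "contact": True, "in_play": False, "in_zone": True},
    {"swung": True, "contact": True, "in_play": True, "in_zone": True},
]
counts = Counter(classify_pitch(**p) for p in pitches)
```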
I ran tests on a sample of pitchers from 2008-2012 who threw at least 200 innings. I excluded 2013 so that I would have a sample to test on that was out of the sample in which the formula was created. Here is the formula for xK% that was determined:
(-0.98*BIP + 1.3*SwStr + 0.43*Fouls + 0.72*ZL + 0.11*OL)/Pitches
There is no constant for the equation and it can be used for every year. Realistically the weights would change slightly from year to year as wOBA does, but the change is probably negligible. As you can see, the weights for each event make sense. BIP is the most harmful, while a swinging strike is the most beneficial since it is always a strike. ZL is next and more beneficial than Fouls just because you can’t strike out on a foul, but can on a ZL. OL is positive, but is near zero.
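Applying the formula is a weighted sum of outcome counts divided by pitches. The sketch below uses the weights given above. It assumes that Pitches is the sum of the five outcome counts, and the pitcher’s counts are made up.

```python
# Weights from the xK% formula above.
XK_WEIGHTS = {"BIP": -0.98, "SwStr": 1.3, "Foul": 0.43, "ZL": 0.72, "OL": 0.11}


def expected_k_rate(counts):
    """xK% from a mapping of outcome name to pitch count.

    Assumption: total pitches equals the sum of the five outcomes.
    """
    pitches = sum(counts[name] for name in XK_WEIGHTS)
    if pitches <= 0:
        raise ValueError("need at least one pitch")
    weighted = sum(weight * counts[name] for name, weight in XK_WEIGHTS.items())
    return weighted / pitches


# Made-up season counts for illustration (3000 pitches in total).
sample_counts = {"BIP": 550, "SwStr": 300, "Foul": 550, "ZL": 550, "OL": 1050}
sample_xk = expected_k_rate(sample_counts)
```

Because there is no constant, the estimate can be recomputed after every pitch just by updating the counts.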
Here is the graph on the sample:
That R^2 is very large. It is larger than that of our xK% source that used actual strike events. Podhorzer looked at seasons with a 50 IP minimum, however, where my sample uses a 200 IP minimum. Let’s see what the R^2 is in an in-sample season (say 2011) with a 50 IP minimum.
Wow! The R^2 is still superior to Podhorzer’s xK% even with the same IP limit. Now what about an out-of-sample test (2013) with at least 50 IP?
Even out of the sample in which the formula was created, we see a very high R^2. This xK% is clearly the best xK% at evaluating current K%, but what about future K%?
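For anyone who wants to check R^2 figures like these on their own data, here is a minimal sketch. It assumes the R^2 in question is the squared Pearson correlation between xK% and actual K%, which is what a scatter plot with a linear trendline reports. The paired values are made up.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either input is constant")
    return (sxy * sxy) / (sxx * syy)


# Made-up (xK%, K%) pairs, one per pitcher.
xk_rates = [0.15, 0.18, 0.22, 0.26]
k_rates = [0.16, 0.17, 0.23, 0.25]
fit_quality = r_squared(xk_rates, k_rates)
```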
I would expect it to do very well in predicting future K%, even better than K% itself. This is because it eliminates all other factors and uses only inputs that the pitcher can control and that are skills. No xK% created before has been able to beat K% in predictability. Let’s look at the out-of-sample 2013 season to see its predictability vs. K%.
Here is a quick and easy chart. N is the number of pitchers in the sample. TBF is total batters faced. RMSE is root mean square error, a measure of how close each pitcher’s 2012 K% (or xK%) is to his 2013 K%. Lower RMSE is better.
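The RMSE comparison can be sketched as follows: score last year’s K% and last year’s xK% against this year’s K% for the same pitchers. All rates below are made up for illustration and are not results from this study.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between paired predictions and outcomes."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up rates for three pitchers.
prior_k = [0.20, 0.25, 0.15]    # last year's K%
prior_xk = [0.21, 0.23, 0.16]   # last year's xK%
next_k = [0.22, 0.23, 0.17]     # this year's K%

rmse_k = rmse(prior_k, next_k)
rmse_xk = rmse(prior_xk, next_k)
```

Whichever predictor has the lower RMSE came closer to next year’s K% on average.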
Here xK% seems to be better until 180 IP. The reason that K% may be better after this is that starters remain with the same catcher who has the same framing. Some pitchers may also get a more constant bias from the umpires, but before we jump to these assumptions, here’s 2011-2012:
Now xK% is always better at predicting K% than K% is. Just remember that this is in-sample. Still, it would not be foolish to say that xK% is better than K% at predicting future K% in any in-season sample.
This xK% is far and away the best xK% to date. I have not seen one with a higher R^2, or even one that can beat K% in predicting the future!
Here are the starters by xK% in 2013 with a minimum of 150 IP:
| Rank | Pitcher | xK% |
|---|---|---|
| 65 | Jorge de la Rosa | 0.173 |
In the next instalment we will look at the best xBB% for pitchers to date!