One Estimator to Rule Them All: xxFIP – Part 2: xBB%
If you missed Part 1 of this series, I advise you to read it first here! xxFIP may just be the best estimator known to date for in-season estimation! Last week I looked at the xK% portion of the formula, and now it’s time to move on to xBB%.
Walk rates are a lot different from strikeout rates. It is very hard to come up with expected values for them, especially in medium-to-large samples. This may be due to pitching around a hitter or maybe due to a pitcher losing his command altogether. The stats we use to get these expected strikeouts and walks assume that each event happens randomly (based on the rates at which the pitcher throws them), but in reality, balls might be bunched together more often for these reasons. Intentional walks are obviously one of these problems as well. Still, xBB% can be a very useful tool, as it eliminates the influence of the catcher and the umpire.
xBB% was created with the same events (or stats) as xK%: (Balls In Play)/Pitch, SwStr%, Fouls/Pitch, (Zone Looking)/Pitch, and (Outside Zone Looking)/Pitch.
Here are the respective coefficients that were determined from the same sample as xK% (2008-2012):
As you can see, BIP/Pitch is the most detrimental to BB% (obviously, since a BIP means no walk) and OL/Pitch has the greatest coefficient (since it should be a ball). ZL/Pitch is the next highest, as about 14% of pitches in the zone are called balls. Fouls are next, since you can’t strike out on a foul, which keeps the PA alive and the chance of a walk alive. Lastly, a swing and a miss is always a strike and can strike out a hitter, so it has the next-lowest coefficient after BIP/Pitch.
Now, xBB% does not have as great of a correlation to BB% as we saw with xK%, but this xBB% is still the best one to date, and the most predictive.
A graph is pretty boring, so why don’t I just tell you the R^2 is around 0.65 for xBB% to BB%. Not nearly as nice as xK%, but we want xBB% to be predictive, which means we don’t really care too much about the R^2.
We know xBB% is going to outperform BB% in small samples because the components of xBB% stabilize much faster than BBs themselves, but where is the crossover point? For previous xBB% formulae, I have not seen it put anywhere past 50 IP.
Here are the results (RMSE) of BB% and xBB% for 2012-2013 which is out of sample.
The table suggests that BB% takes over in predictability at around 150 IP, which is higher than I might have thought! The 2011-2012 data, though, suggests it occurs around 110 IP. Either way, I would still use xBB% over a season just due to the fact that it is independent of the catcher and umpire, as well as pitching around a batter and intentional walks.
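For readers who want to reproduce this kind of comparison, here is a minimal sketch of the RMSE calculation in Python. It assumes each first-season number (BB% or xBB%) is used as-is as the prediction for the following season's BB%. That setup, the function name `rmse`, and the sample values are illustrative assumptions, not the data behind the table.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of rates."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up rates for three hypothetical pitchers (NOT real data).
year1_bb_pct = [0.070, 0.095, 0.110]    # season-one BB%
year1_xbb_pct = [0.078, 0.088, 0.100]   # season-one xBB%
year2_bb_pct = [0.080, 0.085, 0.095]    # season-two BB% (the target)

print("BB%  RMSE:", rmse(year1_bb_pct, year2_bb_pct))
print("xBB% RMSE:", rmse(year1_xbb_pct, year2_bb_pct))
```

Whichever predictor has the lower RMSE within a given innings bucket is the better one for that sample size.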
Here is the xBB% leaderboard for 2013:
|70||Jorge de la Rosa||0.089|
Now that we have xK% and xBB%, we can sub these into our trusty xFIP formula to get an idea of the simplified xxFIP formula. We don’t have xHR/FB yet, so I’ll just leave it as xHR/FB.
xFIP = 3*BB/IP – 2*K/IP + 13*lgHR/FB*FB/IP + C
xK% and xBB% are per-batter rates, so they can be converted to a per-inning basis by multiplying by TBF (total batters faced)/IP (usually around 4.3) for each pitcher. After subbing in xK% and xBB% we get:
xxFIP = (0.318*BIP/P – 2.825*SwStr% – 0.834*Foul/P – 1.152*ZL/P + 1.22*OL/P)*(TBF/IP) + 13*xHR/FB*FB/IP + C
C in this equation is usually very close to cFIP (the FIP constant), but it differs from cFIP by a small amount that varies from year to year.
If we assume 4.3 TBF/IP, we get:
1.367*BIP/P – 12.148*SwStr% – 3.586*Foul/P – 4.953*ZL/P + 5.246*OL/P + HR term + C
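To make the arithmetic concrete, here is a minimal Python sketch of the strikeout-and-walk part of the formula. It uses the combined coefficients exactly as printed above and leaves out the HR term and C. The dictionary keys and function names are hypothetical, and the inputs are assumed to be per-pitch rates expressed as fractions (for example 0.095 for a 9.5% SwStr%).

```python
# Combined per-batter coefficients as printed in the article
# (the xK% and xBB% pieces already folded together).
COEFFS = {
    "bip_per_pitch": 0.318,
    "swstr_pct": -2.825,
    "foul_per_pitch": -0.834,
    "zl_per_pitch": -1.152,
    "ol_per_pitch": 1.22,
}


def scaled_coefficients(tbf_per_ip=4.3):
    """Per-inning coefficients: each per-batter coefficient times TBF/IP."""
    return {name: coef * tbf_per_ip for name, coef in COEFFS.items()}


def kbb_component(rates, tbf_per_ip=4.3):
    """Strikeout/walk part of xxFIP, excluding the HR term and the constant C.

    `rates` maps each key in COEFFS to a per-pitch rate given as a fraction.
    """
    missing = set(COEFFS) - set(rates)
    if missing:
        raise KeyError(f"missing rates: {sorted(missing)}")
    per_batter = sum(COEFFS[name] * rates[name] for name in COEFFS)
    return per_batter * tbf_per_ip


# With the default 4.3 TBF/IP this reproduces the rounded coefficients above.
for name, coef in scaled_coefficients().items():
    print(f"{name}: {coef:.3f}")
```

Passing a pitcher's actual TBF/IP instead of the 4.3 default gives the pitcher-specific version of the formula.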
A little more teaser for xxFIP for you. Here are the top 10 relievers with over 50 IP:
There will be one more part to the series where we will briefly go over xHR/FB and reveal results of xxFIP testing!