Continuing Our ‘xSeries’ By Estimating BABIP, Using StatCast and BIS Data
(Title photo courtesy of Keith Allison, https://www.flickr.com/photos/keithallison/)
Recently, Breaking Blue has used batted ball and plate discipline data to get closer to objective performance indicators of isolated power and strikeout and walk rates. The idea of this exercise is not, as similar exercises are often interpreted, to estimate a player’s true talent or to project him going forward. These xSeries measures aim to inform us of how the player has played, objectively, in the past. If the metric is based on talent-oriented statistics that stabilize quickly, we should have a good idea of the player’s talent level and how he will perform going forward. The publicly available statistics are getting granular enough that the performance-based metrics we have are much better tuned to talent, and that is exciting. But the xSeries is perhaps best used as one tool of many that inform a baseball observer of how well a player is playing.
Today, we’re looking at Batting Average on Balls in Play, or BABIP. BABIP has different levels of relevance for batters and pitchers, so knowing what to look for can be confusing. Pitchers generally run BABIPs that regress to league average. Hitters for the most part cluster around the average (which is usually .300 or so), but there are many players who establish rates far above or below it. Starling Marte and Paul Goldschmidt are perhaps real .350+ BABIP players. Hitters who hit a lot of fly balls, particularly infield flies (such as Toronto’s own Edwin Encarnacion), can have true-talent BABIPs in the .260s.
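For reference, BABIP is conventionally computed as (H − HR) / (AB − K − HR + SF). Here is a minimal Python sketch of that standard formula. The function name and the season line fed into it are made up for illustration and do not belong to any player in the sample.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, not a real player.
example = babip(hits=150, home_runs=20, at_bats=550, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")  # 0.299
```

Home runs come out of both the numerator and the denominator because they never give the defense a chance to make a play.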
In terms of utility, infield flies are essentially strikeouts. Some hitters who strike out infrequently but pop out a lot can have peripherals that look great on the surface (few strikeouts!) but their rate of wasted plate appearances is pedestrian all the same. Ground balls go for hits more often than fly balls, although the latter can lead to much more damaging hits, namely home runs. Speed and the velocity at which batted balls are struck are obviously important contributors to BABIP.
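To make the "wasted plate appearance" idea concrete, here is a rough sketch that lumps strikeouts and infield flies together per plate appearance. That definition is my own simplification for illustration, and both hitters below are hypothetical.

```python
def wasted_pa_rate(strikeouts, infield_flies, plate_appearances):
    """Share of plate appearances ending in a strikeout or an infield fly."""
    return (strikeouts + infield_flies) / plate_appearances


# Two hypothetical hitters with 600 plate appearances each.
whiffer = wasted_pa_rate(strikeouts=120, infield_flies=10, plate_appearances=600)
popper = wasted_pa_rate(strikeouts=80, infield_flies=50, plate_appearances=600)
print(f"{whiffer:.3f} {popper:.3f}")  # 0.217 0.217
```

The second hitter strikes out a third less often, yet the two waste plate appearances at the same rate once pop-ups are counted.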
So, to continue the xSeries, I ran a regression involving StatCast batted ball data, the new Baseball Info Solutions data posted on Fangraphs, and basic batted ball distributions. StatCast data isn’t comprehensive, but it is the important component here, so the minimum threshold for inclusion in this exercise was determined by it: 85 at-bats of StatCast data. Other xBABIPs have been created in the past, but StatCast data is brand new and there are gains to be made by knowing exit velocities and distances.
The metrics included in xBABIP are:
– Maximum exit velocity at which a batter has struck a ball. (This may seem arbitrary, but the samples are large enough that this reasonably represents the upper limit of a hitter’s strength.)
– Average fly ball or line drive exit velocity.
– FB%, GB% and IFFB% (fly ball, ground ball and infield fly rates per batted ball).
– Speed score, developed by Bill James.
– Hard%, Pull% and Oppo%. The percentage of batted balls that have been hit hard, pulled or struck to the opposite field, as classified by Baseball Info Solutions.
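For readers who want to see the mechanics, here is a bare-bones sketch of an ordinary least squares fit in plain Python. It is not the actual xBABIP model: it uses only two of the covariates above (Hard% and IFFB%), the data are synthetic, and the coefficients are invented so that the fit has a known answer to recover. The helper names `fit_ols` and `predict` are mine.

```python
def fit_ols(rows, targets):
    """Return least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Fitted value for one player: intercept plus weighted covariates."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic (Hard%, IFFB%) pairs as fractions; not real players.
rows = [(0.30, 0.10), (0.35, 0.08), (0.25, 0.15),
        (0.40, 0.05), (0.28, 0.12), (0.33, 0.20)]
# Targets built from made-up coefficients so the fit has a known answer.
TRUE_COEFS = [0.20, 0.40, -0.30]
targets = [predict(TRUE_COEFS, r) for r in rows]

coefs = fit_ols(rows, targets)
print([round(c, 3) for c in coefs])
```

Because the synthetic targets contain no noise, the fit should land on the made-up coefficients almost exactly. With real BABIPs there is plenty of noise, and the fitted value for each player is what gets reported as xBABIP.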
Here’s an illustration of how the values matched up. It’s not really the general fit that’s important to look at, since there will naturally be a great fit as the graph is the product of a regression. Note how most players tend towards the middle area and that ‘errors’ (the distance between the points and the line) appear to be normally distributed. The adjusted R^2 value (which adjusts down for the number of covariates) is .528. Fits with better adjusted R^2 values were attainable when additional covariates were considered, but the chosen combination probably provides a better glance at talent. It would have been nice to have multiple seasons of data to compare the performance and peripherals of players from season to season by my metric, but we only have the limited 2015 StatCast data. Multiple seasons would help us isolate the peripherals that are talent-based.
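As a quick refresher on what "adjusts down for the number of covariates" means, the usual formula is 1 − (1 − R^2)(n − 1)/(n − p − 1), where n is the number of players and p the number of covariates. The sketch below uses hypothetical inputs (a raw R^2 of .60 and 150 players); neither figure comes from the model above.

```python
def adjusted_r2(r2, n_obs, n_covariates):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_covariates - 1
    if dof <= 0:
        raise ValueError("need more observations than covariates plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical: same raw R^2 and sample size, different covariate counts.
lean = adjusted_r2(0.60, n_obs=150, n_covariates=9)
bloated = adjusted_r2(0.60, n_obs=150, n_covariates=30)
print(f"{lean:.3f} {bloated:.3f}")  # 0.574 0.499
```

The same raw fit is penalized more heavily as covariates pile up, which is why adding a covariate only raises the adjusted figure when it improves the raw fit by more than the penalty costs.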
And here’s the comprehensive table, with each player in the sample. Delta is BABIP – xBABIP and Quotient is BABIP/xBABIP.
|Jung Ho Kang||0.312||0.318||0.006||1.02|
|Steven Souza Jr.||0.278||0.301||0.023||1.08|
Brock Holt is our xBABIP king!
The top under-performers (players whose batted ball peripherals suggest they should be carrying higher BABIPs) by Delta are Ryan Braun, Jon Jay, Will Middlebrooks, Ryan Zimmerman and Luis Valbuena.
Braun has been a different player since being suspended for using performance enhancers a couple of years ago. A chronic thumb injury has clouded his health situation since last year and he has taken a significant step back on offense. One of the very best hitters in baseball from 2007-2012, Braun now projects as a first-division player who isn’t a star. This is in large part due to BABIP. Braun was consistently producing BABIPs in the .330-.360 range but he’s at .264 right now and Fangraphs Depth Charts has him at .308 going forward. Interestingly, Braun’s batted ball peripherals suggest he has been hitting the ball well enough to get him back in that elite zone, at a .351 mark.
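As a worked example, here are Braun’s two figures run through the Delta and Quotient definitions exactly as stated with the table (BABIP − xBABIP and BABIP/xBABIP). The function name is mine; taking the difference the other way around would simply flip the sign.

```python
def delta_and_quotient(babip, xbabip):
    """Delta = BABIP - xBABIP, Quotient = BABIP / xBABIP."""
    return babip - xbabip, babip / xbabip


# Braun's actual and expected BABIP as quoted above.
delta, quotient = delta_and_quotient(babip=0.264, xbabip=0.351)
print(f"delta={delta:+.3f}, quotient={quotient:.2f}")  # delta=-0.087, quotient=0.75
```

Under this convention an under-performer shows a negative Delta and a Quotient below 1.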
Luis Valbuena was obviously going to feature highly on this list as his ridiculous batting line is well-publicized, but his xBABIP still isn’t very palatable at .230. He hits an utter ton of fly balls, with an average-ish infield fly rate. He’ll keep swatting home runs but the batting average (and on-base percentage) may not quite normalize to the point that he’s an above-average hitter. Valbuena’s career path has been strange to say the least.
The top over-performers are Kris Bryant, Yoenis Cespedes, Nelson Cruz, Chris Colabello and Jimmy Paredes.
These players have BABIPs that any observer would call obvious candidates to regress. Cespedes has the lowest actual BABIP of the five at .358. That’s a figure that some hitters can support. But Cespedes has been slightly below average in terms of how well his batted ball peripherals suggest he is hitting the ball. Chris Colabello’s bizarre season definitely warrants its own article and I’d like to break down his likely offensive and overall values going forward in a lengthy article sometime soon.
Next up in the xSeries, Breaking Blue will bring it all together by presenting xwOBA and xSlashLine.