Using July Stats To Discern Predictive Quality of Our Batted Ball and Plate Discipline Models
(Title photo courtesy of Paul Hadsall, https://www.flickr.com/photos/paulhadsall/ . Paul Goldschmidt had the best xwOBA of any hitter entering July.)
Breaking Blue has been rolling out periodic and incremental updates to our “xSlash” project, an attempt to use advanced batted ball and plate discipline statistics to create expected batting lines. One of the exciting statistical stories of the 2015 season is the addition of StatCast player tracking technology. Teams have full access to the data, while the public mostly sees a trickle of batted ball information, which shows up in the Gameday application and is scraped by Daren Willman’s excellent Baseball Savant website.
Our xSlash project uses StatCast data, Baseball Info Solutions’ batted ball numbers (which were added to Fangraphs this season), and plate discipline data to produce expected rates for isolated power (ISO), batting average on balls in play (BABIP), strikeout rate (K%) and walk rate (BB%). These four rates form the foundation of production at the plate and are enough to build an expected total line, illustrated with the triple slash line, weighted on-base average (wOBA) and weighted runs created plus (wRC+). Breaking Blue produced four articles earlier in the year that presented the four important expected rates on their own, with an eye towards combining them into one system. Our models have greatly improved since then, notably through simplification to capture more talent and less noise (adding extra variables to improve the fit isn’t worthwhile if those variables don’t stabilize quickly and aren’t intimately tied to talent).
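As a rough illustration of how four rates can imply a full slash line, here is a simplified per-plate-appearance sketch. This is not the actual xSlash math: it ignores HBP and sacrifice flies and treats home runs as ordinary balls in play, so the numbers are only approximate.

```python
def expected_slash(bb_pct, k_pct, babip, iso):
    """Approximate a triple slash line from BB%, K%, BABIP and ISO.

    Simplifying assumptions (NOT the real xSlash model): no HBP or
    sacrifice flies, and home runs are counted as balls in play.
    BB% and K% are per plate appearance.
    """
    ab_per_pa = 1.0 - bb_pct              # every non-walk PA is an at-bat
    contact_per_pa = ab_per_pa - k_pct    # at-bats that put the ball in play
    hits_per_pa = babip * contact_per_pa  # BABIP applied to all contact
    avg = hits_per_pa / ab_per_pa
    obp = hits_per_pa + bb_pct            # hits and walks both reach base
    slg = avg + iso                       # ISO = SLG - AVG by definition
    return avg, obp, slg
```

Under these assumptions, a hitter with an 8% walk rate, 20% strikeout rate, .300 BABIP and .150 ISO comes out around .235/.296/.385.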
Breaking Blue embarked on this project because there was no public source that was actually using the new batted ball data in a comprehensive and constantly-updated fashion. Quoting things like average exit velocity, hard hit rate and average fly ball distance is very common in the baseball media and blogosphere but those numbers without context are hard to digest. We wanted to know what we could learn by placing these statistics and many similar ones into context.
Until now, however, we haven’t had a chance to look back at our expected rates and see what they’ve actually meant. We don’t have past years of StatCast data, so back-testing wasn’t really an option and we had limited sample data to work with. For the latter reason, I’ve tried to keep from overfitting the batted ball data.
I wanted to use July numbers to see how predictive the expected batting lines are of future performance, so I grabbed July performance from Fangraphs for all players with 60 plate appearances in the month. On June 30th, I ran an expected batting lines iteration, which includes all 2015 performance through June, save for June 30th games. These figures will be compared to actual July numbers. I’ll also compare July numbers to current projections from the Steamer and ZiPS systems, as well as preseason Fangraphs Depth Charts (which use Steamer and ZiPS) and preseason Baseball Prospectus PECOTA projections. Only hitters included in all of the systems were considered.
I wanted to include preseason numbers to compare a system that is totally agnostic of pre-2015 information to a system entirely reliant on it. The expected peripherals system we have isn’t specifically set up to project future performance, but if it can accurately capture true talent, then it should be good at projecting future performance all the same.
Current projection systems (Steamer and ZiPS) include all public information and are set up to project future performance. They probably do not incorporate StatCast data yet, though. I’d expect them to perform best in this test. These systems are aware of July information, since I did not manage to grab their outputs at the end of June. This shouldn’t make much of a difference, since one month of data does little to counteract what is in most cases many seasons of history, but it does give them a small advantage.
The systems will be compared based on their wOBA outputs, using root mean square error (RMSE) and mean absolute error (MAE). PECOTA doesn’t specifically project wOBA, so I ran their outputs through Fangraphs’ basic wOBA formula.
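The conversion and the two error measures are straightforward to sketch. The linear weights below are approximately the 2015 values from Fangraphs’ Guts! page; the exact coefficients vary slightly by season, so treat them as illustrative.

```python
import math

# Approximate 2015 linear weights from Fangraphs' Guts! page
# (assumption: exact values differ slightly each season).
W_BB, W_HBP, W_1B, W_2B, W_3B, W_HR = 0.687, 0.718, 0.881, 1.256, 1.594, 2.065

def basic_woba(bb, hbp, singles, doubles, triples, hr, ab, ibb, sf):
    """Fangraphs' basic wOBA: linear weights on positive outcomes,
    divided by plate appearances excluding intentional walks."""
    numerator = (W_BB * (bb - ibb) + W_HBP * hbp + W_1B * singles
                 + W_2B * doubles + W_3B * triples + W_HR * hr)
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator

def rmse(projected, actual):
    """Root mean square error; penalizes large misses more heavily."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual))
                     / len(actual))

def mae(projected, actual):
    """Mean absolute error; every point of miss counts equally."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)
```

Each system’s projected wOBA list is compared against the same list of actual July wOBAs across the sample of hitters.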
Here are the results! The sample was 167 hitters.
The current projections from Steamer and ZiPS pace the field, in terms of both RMSE and MAE, as expected. They’re using the most information and use that information very well. My xSlash system does improve greatly on the actual pre-July numbers, which was also expected. High-quality past peripherals are going to be more predictive of future performance than pure past performance. The errors are fairly large for all methods because the one-month sample of July is very small and thus subject to great variance. Still, the signal shines through because we’re looking at so many hitters.
What I find interesting in these July results is that xSlash performs very similarly to the preseason projections. First-half batted ball and plate discipline peripherals predict July results almost as well as comprehensive pre-season projections do.
If you’re required to choose one system to use to evaluate a hitter’s expected future performance, the leading public projection systems are definitely what you should go with. Incorporating current-season advanced batted ball data may add value though. I wanted to next check if combining xSlash with projection systems resulted in added value. So, here are the July RMSE and MAE of some composites.
None of these composites beat the marks established by ZiPS on its own. Averaging ZiPS and xSlash did not improve on the July projection accuracy of ZiPS alone, which was a bit of a surprise for me. Perhaps collecting Steamer and ZiPS before July started would have lessened this effect, or perhaps a composite should use some sort of weighted average instead of giving the projections and xSlash equal weight. I was also surprised to see that averaging Steamer and ZiPS wasn’t useful; Fangraphs averages the two in their Depth Charts calculations, believing it increases accuracy.
Averaging xSlash with preseason projections definitely seems to improve upon the accuracy possible by using each method on its own. This is a confirmation of what we’d expect. The information provided by both past and current seasons is useful.
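The composites above are simple equal-weight averages; a weighted blend is one natural next step. A minimal sketch, with the 3:1 weighting below chosen purely for illustration:

```python
def composite_woba(projections, weights=None):
    """Blend several wOBA projections for one hitter.

    With no weights this is the simple equal-weight average used for
    the composites above; passing weights gives a weighted blend that
    could, say, lean on a projection system more than on xSlash.
    """
    if weights is None:
        weights = [1.0] * len(projections)
    return sum(w * p for w, p in zip(weights, projections)) / sum(weights)
```

For example, an equal blend of a .340 ZiPS projection and a .320 xSlash line gives .330, while weighting ZiPS 3:1 (a hypothetical choice) gives .335.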
The xSlash project is an attempt to use the advanced peripherals available to create expected batting lines. It’s useful to establish in small samples how a player has really been performing. It does a much better job of capturing talent than the raw performance numbers do. In other words, a player’s current-season xwOBA is likely to be much more representative of his skills than his actual wOBA.
The expected current-season lines don’t replace actual projections, which use more information and know how to weigh it all properly. Perhaps more work could be done in further research here on Breaking Blue to investigate how to incorporate current-season batted ball data into a projection framework.