2015 FRIAS Results Update – First Third of Season
The first third of the 2015 major league season is already complete! Hard to believe; it seems like just yesterday that season preview articles were being written and the season's storylines were just beginning to emerge. But here we sit. Many analysts and fans across the sport consider Memorial Day the date after which accrued results become meaningful, in terms of both the standings and individual players' statistics. Some important numbers are stabilizing, and the wins teams have banked are becoming numerous enough to matter when predicting end-of-season outcomes.
The Blue Jays have dug themselves into an early hole, sitting a few games below .500 with only one starting pitcher on his way to a decent season in Drew Hutchison (his 93 xFIP- is followed by Marco Estrada's 117), although they lead the AL East in run differential and are by all means still in the contention discussion. The best-regarded teams in baseball, the Nationals and Dodgers, sit atop their divisions after overcoming some early adversity.
With a legitimate sample of innings on the books for many pitchers, I wanted to look at how my FRIAS system has performed overall, which types of pitchers it has overvalued/undervalued, and how that compares to the ZiPS and Steamer systems, which are very frequently cited and represent an aggregation of all public information. FRIAS is limited in its horizon and I’m interested to know how far using 2014 results alone and a simple approach will get us.
Fangraphs recently released new batted ball data from Baseball Info Solutions which includes quality and direction of contact. I decided to quickly create a more advanced version of the FRIAS concept, using this new data as well as interaction effects in my regressions (the original FRIAS used only main effects) across relevant metrics. It's termed FRIAS2.0 in the data used in this article; for future reports and calculations I will continue to use the original FRIAS, since I value its simplicity and I suspect that any additions to the original won't add much value. My calculations suggest that controlling batted-ball quality and direction isn't really a pitcher skill, which agrees with the central concept of DIPS.
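To illustrate the main-effects vs. interaction-effects distinction, here is a minimal sketch of the two design matrices. The predictors, response values, and the OLS fit below are entirely hypothetical; this is not the actual FRIAS regression or its inputs.

```python
import numpy as np

def design_matrix(x1, x2, interactions=False):
    """Build a regression design matrix from two predictors.

    Main effects only:  columns [1, x1, x2]
    With interactions:  columns [1, x1, x2, x1*x2]
    """
    cols = [np.ones_like(x1), x1, x2]
    if interactions:
        cols.append(x1 * x2)  # interaction term, FRIAS2.0-style
    return np.column_stack(cols)

# Hypothetical predictors: 2014 K/9 and swinging-strike rate,
# with made-up 2015 K/9 values as the response.
k9_2014 = np.array([7.5, 9.1, 6.8, 10.2, 8.4])
swstr   = np.array([0.09, 0.12, 0.08, 0.13, 0.10])
k9_2015 = np.array([7.8, 9.4, 6.5, 10.0, 8.1])

# Ordinary least squares fit of the interaction model
X = design_matrix(k9_2014, swstr, interactions=True)
beta, *_ = np.linalg.lstsq(X, k9_2015, rcond=None)
```

An interaction term lets the effect of one predictor depend on the level of another (e.g., a swinging-strike rate mattering more for high-strikeout pitchers), at the cost of more parameters to estimate from the same data, which is one route to the overfitting discussed later.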
For the comparisons, I collected K/9, BB/9, and xFIP numbers for all pitchers with at least 10 innings pitched through June 3rd, 2015. Ten innings is a very small sample, but my intention is to see how changing the minimum affects accuracy and the differentiation between projection systems. Also, a relatively small innings minimum must be chosen in order to neutralize selection bias: many under-performing pitchers will not throw as many innings as expected, and choosing an innings cutoff that excludes them will lead to a system's precision being systematically over-estimated. The 40-inning cutoffs described in the tests below can be slightly biased for this reason, although they also happen to represent the pitchers who have collected the largest samples, and as such have numbers that most accurately represent their skill. It's a catch-22 that often makes baseball analysis difficult to fully accept.
As FRIAS needs previous-season innings to make a projection, it projects far fewer pitchers than systems like Steamer or ZiPS do; this fact makes it inadequate for some purposes. In this exercise only pitchers that FRIAS projected were included.
To start, here’s an expression of the K/9 differences. RMSE is the root mean square error and MAE is the mean absolute error. Both will let us know how close the projections have been to the actual values.
[Table: K/9 projection errors (RMSE and MAE) at minimums of 10, 20, and 40 IP]
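For reference, the two error metrics can be computed as follows. The projected and actual K/9 values below are hypothetical, not real player data.

```python
import math

def rmse(projected, actual):
    """Root mean square error: penalizes large misses more heavily."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

def mae(projected, actual):
    """Mean absolute error: average miss size, all misses weighted equally."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)

# Hypothetical K/9 projections vs. accrued 2015 rates
proj = [8.1, 6.5, 9.0]
act  = [7.4, 6.9, 10.2]
print(round(rmse(proj, act), 3))  # 0.835
print(round(mae(proj, act), 3))   # 0.767
```

Because RMSE squares the errors before averaging, a system with a few very bad misses can trail by RMSE while still looking fine by MAE, which is why both are reported.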
Steamer did best across all categories here, and its edge is large enough that I'd expect it to continue. I know that things like velocity and per-pitch metrics are included in Steamer, and that could set it apart. It's hard to know exactly what mix of numbers goes into ZiPS, since Dan Szymborski obviously doesn't reveal his entire process, but he's stated that it's based on a four-year weighted average and uses some cluster analysis.
My FRIAS numbers trailed the well-known public projections for strikeouts by a meaningful amount. Still, if your expectations for the 2015 season were entirely predicated on what happened in 2014, you'd be doing okay; 2014 results have had an enormous impact on 2015 strikeout rates. To explore the particular stances FRIAS took that affected its fit scores, I found the difference between the absolute error of a player's FRIAS K/9 projection and the average of the absolute errors of his Steamer and ZiPS K/9 projections. I'll just call this a pitcher's 'Diff' going forward.
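Concretely, a pitcher's 'Diff' can be computed as below; the accrued K/9 value in the example is a placeholder, not any pitcher's real 2015 rate.

```python
def diff(actual, frias, steamer, zips):
    """'Diff' = |FRIAS error| - mean(|Steamer error|, |ZiPS error|).

    Negative values mean FRIAS was closer to the accrued number than
    the average of the two public systems; positive means it was worse.
    """
    frias_err = abs(frias - actual)
    public_err = (abs(steamer - actual) + abs(zips - actual)) / 2
    return frias_err - public_err

# Hypothetical K/9 line: FRIAS projected 9.06, Steamer and ZiPS both
# projected 7.65, and the pitcher's accrued rate is 9.2.
print(round(diff(9.2, 9.06, 7.65, 7.65), 2))  # -1.41
```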
The five best 'hits' for FRIAS in terms of K/9 are Jason Motte, Brad Hand, Carlos Martinez, Shelby Miller and Ernesto Frieri. These are pitchers whose 2015 strikeout rates have tracked their 2014 numbers much more closely than the consensus projections expected. Shelby Miller and Carlos Martinez are the starting pitchers in this mix who have thrown significant innings. Miller is having a bounceback year overall, but his strikeout rate hasn't bounced back; his poor strikeout indicators from 2014 have carried over. FRIAS was a huge believer in Carlos Martinez's strikeout ability and that stance has been rewarded: FRIAS projected a 9.06 K/9 while Steamer and ZiPS averaged 7.65. FRIAS is beating FRIAS2.0 across all categories, so some sort of overfitting is occurring with regard to the predictiveness of the enhanced batted ball data for strikeouts.
Here is an expression of the BB/9 differences.
[Table: BB/9 projection errors (RMSE and MAE) at minimums of 10, 20, and 40 IP]
The results were similar to K/9 overall, although FRIAS wound up essentially equaling Steamer and ZiPS by absolute error among pitchers who have thrown 20+ or 40+ innings. FRIAS bombed in projecting the walk rates of Mike Pelfrey and Aaron Sanchez, and there are reasons behind this: Pelfrey pitched hurt in 2014 and threw a limited number of innings, while Sanchez threw a similarly small number of innings and did so out of the bullpen. A more sophisticated projection system would have picked up on this information and taken a more neutral stance on these pitchers; that is what ZiPS and Steamer did. None of FRIAS's walk rate "hits" are nearly as impactful as Pelfrey (whose Diff was 2.54) or Sanchez (2.08): Alex Wilson (-1.63) and Alex Colome (-0.91) led the way.
Here is an expression of the xFIP differences. Since we're looking at small samples all around, I decided to cross-reference the FIP projections with accrued xFIP stats instead of accrued FIP. The theory behind FIP/xFIP suggests that the two values should be identical in expectation, as they attempt to measure the same thing (how a pitcher controls the events he impacts, without the aid of his defence), so we can take a projection system's FIP projection as its reading for xFIP as well. HR/FB rates take nearly 10 years to stabilize for a starting pitcher, so we're safe conflating FIP and xFIP over this third-of-a-season sample. Otherwise, pitchers who have allowed a disproportionate number of home runs would see their FIP skyrocket unfairly, marring what were otherwise reasonable projections. Matt Shoemaker's 5.08 FIP is not at all characteristic of how he has pitched this season.
[Table: xFIP projection errors (RMSE and MAE) at minimums of 10, 20, and 40 IP]
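The FIP/xFIP substitution can be made explicit: xFIP simply swaps a pitcher's actual home run total for an expected one (fly balls allowed times the league HR/FB rate). The constant and the counting stats in the example below are placeholders, not any particular pitcher's line.

```python
def fip(hr, bb, hbp, k, ip, c=3.10):
    """Fielding Independent Pitching. The constant c is set each season
    so league FIP matches league ERA; 3.10 is just a typical value."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + c

def xfip(fb, lg_hr_fb, bb, hbp, k, ip, c=3.10):
    """Identical to FIP, except actual home runs are replaced with an
    expected total: fly balls allowed times the league HR/FB rate."""
    return (13 * (fb * lg_hr_fb) + 3 * (bb + hbp) - 2 * k) / ip + c

# Placeholder line: 8 HR, 90 FB, 20 BB, 3 HBP, 70 K in 75 IP,
# with a league HR/FB rate of 9.5%.
print(round(fip(8, 20, 3, 70, 75.0), 2))           # 3.54
print(round(xfip(90, 0.095, 20, 3, 70, 75.0), 2))  # 3.64
```

In this placeholder line the expected home run total (8.55) slightly exceeds the actual total of 8, so xFIP comes out a touch higher than FIP; a pitcher with an inflated HR/FB rate, like the Shoemaker case above, shows the reverse pattern.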
The projections were again lumped closely together, with Steamer leading the way. ZiPS trailed in terms of absolute error but was right there by RMSE. It looks like FRIAS captures enough information to put together a very competitive FIP projection, which is somewhat surprising. As I suspected, one year carries a lot of information about a pitcher's talent level, especially now that we have per-pitch statistics. Pitchers throw thousands of pitches over the course of a season, and each pitch carries valuable information.
Here is a graph of the 'Diff' values for xFIP: the absolute error of FRIAS's projection minus the mean of the absolute errors of Steamer and ZiPS. Only pitchers with at least 40 innings pitched were considered, as I wanted the larger samples in order to draw more meaningful conclusions about individual pitchers.
That the graph slopes positively suggests that FRIAS has had larger errors than the other systems for pitchers who have pitched very poorly. This happened both for pitchers it expected to be good or decent (Sanchez and Kendrick) and for pitchers it expected to be bad (Pelfrey). Contextual factors influence some of this, and future versions of FRIAS that adjust for changes in pitcher role and park effects would be more useful.
The pitchers FRIAS was most right about have, predictably, been pitchers who experienced breakouts in 2014 or had 2014 peripherals that belied their minor league careers. Michael Pineda, Jake Arrieta and Carlos Carrasco were all fantastic by every measure in 2014; FRIAS bought in completely and has been rewarded. Trevor May's surface stats were poor in 2014 and his minor league record was nothing spectacular, but his peripherals indicated a pitcher with plus strikeout ability and better control than his shaky record of walks allowed suggested. Jesse Hahn looked like a mid-rotation starter over 12 starts in 2014, and that has persisted.
To conclude, it does look like critically analyzing previous-year metrics is a valid way of deriving expectations for pitchers' futures, and FRIAS is a fairly simple representation of this fact. Among the pitchers it projected (i.e., those who threw at least 20 innings in 2014), it has been just as good as ZiPS at evaluating overall skill and is slightly behind Steamer. This is encouraging, and means I will continue to refine the process by which I derive single-season future expectations. The expanded version of FRIAS, FRIAS2.0, did not outperform the original, which I find interesting. There is a limit to how much we can assume from one season, and further improvements will come from better and more granular data to crunch rather than from expanding the crunching process.
(Title photo courtesy of Keith Allison: https://www.flickr.com/photos/keithallison/)