The Curious Case of Matt Cain

A few years ago I wrote an article about Matt Cain’s incredibly unlucky 7-16 win-loss record in 2007. Cain’s example is one of many showing that win-loss records can be affected by numerous factors outside the pitcher’s control, and that we shouldn’t use Wins to gauge pitchers’ abilities.

ERA is another stat influenced by factors outside the pitcher’s control. A pitcher’s home ballpark, team defense, BABIP, LOB%, and homers allowed all factor into his ERA, yet much of the variability in those stats has little to do with the pitcher himself. Thus, ERA can fluctuate greatly above and below a pitcher’s true ability, making it a poor estimator of that ability. Translation: Don’t use it.

While Cain’s 57-62 win-loss record is hardly representative of his excellent 3.45 ERA, his 3.45 ERA is hardly representative of his apparent skill set. Perhaps karma is at work. Cain’s peripheral pitching stats (strikeout rates, walk rates and groundball rates) put him in a tier of pitchers who average about a 4.4 ERA, not the sparkling 3.45 career number he has posted. The 4.4 figure comes from FanGraphs’ xFIP, a stat that predicts ERA based only on the peripheral stats listed above.
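For readers curious how xFIP is put together, here is a minimal Python sketch. It assumes the formula as FanGraphs publishes it, which works from fly balls allowed (closely tied to groundball rate) rather than from groundball rate directly, and swaps the pitcher’s actual home runs for the number a league-average HR/FB rate would predict. The function name, the default league HR/FB rate and the default constant are placeholders of mine; the real values change from season to season.

```python
def xfip(fly_balls, walks, hit_by_pitch, strikeouts, innings,
         league_hr_per_fb=0.105, constant=3.10):
    """Rough xFIP estimate from peripheral stats.

    league_hr_per_fb and constant are illustrative placeholders;
    look up the actual league values for the season in question.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    # Home runs the pitcher "should" have allowed at a league-average HR/FB rate
    expected_hr = fly_balls * league_hr_per_fb
    # Standard FIP weights: 13 per homer, 3 per walk or HBP, -2 per strikeout
    numerator = 13 * expected_hr + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + constant
```

Because the pitcher’s own home run total never enters the calculation, a pitcher with an unusually low HR/FB rate will show an xFIP well above his ERA, which is part of what is going on with Cain.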

So, is Matt Cain an exception to the rule, an outlier if you will, or is he a random spike ready to come back to earth any season now? That question is impossible to answer for sure, but I can make a guess.

I looked at the 57 pitchers who have thrown enough innings in the last 3 seasons to make FanGraphs’ “qualified innings” cut. I focused on the difference between each pitcher’s xFIP and ERA over that span. In other words, I wanted to know by how much each pitcher’s ERA is underperforming or outperforming his predicted ERA (xFIP). Though Cain has definitely outperformed his xFIP, recording a much lower ERA in the last 3 seasons than xFIP would predict, he’s not in a league of his own. Right up there with him are standout pitchers Johan Santana, Felix Hernandez and Adam Wainwright. It’s tempting, then, to argue that Cain is in a group of overachievers, all of whom have that special something that enables them to beat the averages. Not so fast. Of those 57 pitchers, the bottom four (“underachievers”) go like this: James Shields, Javier Vazquez, Ricky Nolasco and Josh Beckett, all fantastic pitchers themselves. This makes it hard to argue that a certain type of pitcher is able to break the xFIP mold.

Then I made a histogram of the xFIP/ERA differences. Its shape, seen below, is unsurprisingly that of a bell curve, or normal distribution. The fact that these pitchers fall into a nice normal curve doesn’t prove anything for sure, but it indicates that our measure of over- and underperformance has an extremely common distribution, and could be almost completely random. Using the definition of an outlier, none of the overachieving pitchers (Cain et al.) were actually outliers. The only two outliers were Nolasco and Beckett on the other side.

[Histogram of ERA minus xFIP for the 57 qualified pitchers. x-axis: ERA minus xFIP; y-axis: number of pitchers]
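The outlier check described above can be sketched in a few lines of Python. The article does not say which definition of an outlier was used, so this sketch assumes the common 1.5 × IQR rule; the function names and the input format (a dict mapping a pitcher’s name to an (ERA, xFIP) pair) are also my own assumptions.

```python
from statistics import quantiles

def era_minus_xfip(pitchers):
    """Map each pitcher to ERA minus xFIP (negative = outperforming xFIP)."""
    return {name: round(era - xfip, 2) for name, (era, xfip) in pitchers.items()}

def iqr_outliers(diffs, k=1.5):
    """Return the pitchers whose difference falls outside the IQR fences.

    Assumes the 1.5 * IQR rule; other outlier definitions would
    flag a different set of pitchers.
    """
    q1, _, q3 = quantiles(list(diffs.values()), n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return {name: d for name, d in diffs.items() if d < low or d > high}
```

With the career figures quoted earlier, `era_minus_xfip({"Cain": (3.45, 4.40)})` comes out at about -0.95 for Cain. Feeding all 57 pitchers’ three-year numbers through both functions would show whether that gap sits outside the fences or just in the tail of the bell curve.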

Outperforming one’s xFIP may be an actual ability that is itself distributed normally, but I doubt that. Pitchers who record significantly lower ERAs than xFIPs almost invariably have better LOB%s, BABIPs and HR/FB rates. In my own research I have been unable to find a pattern in the types of pitchers who perform well in these categories. As seen in this data set, good pitchers were on both ends of the “luck” spectrum, and that tends to be the general case.

There’s no evidence to suggest that Cain is anything special, and I believe he is just a random spike waiting to regress. AT&T is a good ballpark for him, and the Giants’ defense has been well above average, so I don’t expect his ERA to jump way up over 4.00. But it wouldn’t surprise me to see something close to 4.00 next year.


4 Responses to The Curious Case of Matt Cain

  1. The reason you come to this conclusion is that most of the sabermetric community is apparently unaware of Tom Tippett’s great study of DIPS. One of his findings was that there are players who are capable of sustaining a BABIP significantly below the .300 mean that the vast majority are doomed to regress to.

    Tangotiger has figured that a pitcher needs to throw roughly 6 seasons worth of IP in order to have enough PA to say that his BABIP is statistically significantly below the .300 mean. Cain is at the point now, at a little more than 5 seasons, where his BABIP would have to be something crazy like .350-.400 for him to regress to .300 for his career.

    Another problem is that a lot of the sabermetric orthodoxy is using flawed statistics to make “conclusive” pronouncements. Clearly, there are pitchers who are more adept at inducing weak popups that become infield flies and shallow fly balls, and while they are clearly in the category of groundballs in terms of effectiveness for the pitcher, they are categorized typically as a flyball, and thus, for example, their HR/FB is low and considered a fluke.

    Instead of just dismissing all outliers as those destined to regress to the mean, the better way is to analyze why they might be outliers. One stat available today is the percentage of infield flies. The saber community has forecast the demise of supposed outliers like Zito and Cain for years, yet anyone who examined their infield fly percentages would find that both are usually among the leaders in the majors.

    Those who have a low percentage and don’t have a history of sustaining a low BABIP or high infield fly percentage would be those categorized as regression candidates.

  2. uoduckfan33 says:

    I’ll be sure to look at that work by Tippett. Since this article I have looked into Cain’s case a little more. Perhaps he can induce more pop-ups than the average pitcher. But he has also played in front of some pretty good defenses, and Triples Alley has kept his HR/FB down below what his true ability might suggest. Since 2005, the Giants have accumulated by far the most UZR runs saved of any team in the majors. While it is not a perfect stat by any means, that’s a strong suggestion that he’s gotten a little extra help. (I realize that deducing whether a defense makes a pitcher better or a pitcher makes a defense better, and to what extent, is a difficult question to answer. I tend to believe in the former, but maybe that’s just what I’ve heard for so long.)

    For his career, his HOME BABIP is .255 and AWAY is .279.
    Then his HOME HR/FB is 6.7% vs 7.3% AWAY. So even his away numbers beat the average, but not by nearly as much. Since his defense follows him wherever he goes, that might help to explain some of the 21 points his BABIP is below average on the road. What’s left is some combination of randomness and skill, in my opinion. It will be interesting to see how it turns out this season with Huff and Burrell in the outfield, and Torres being another year older. I realize one year would be a relatively small sample size compared to the 5+ years he’s accumulated, but it may be the start of the sabermetric prophecy, or just another year he farts on the sabermetric community.

    Thanks for your comment!

  3. Bajuju says:


    • uoduckfan33 says:

      And his home/away splits look pretty good across the board, too! I guess I’ll wipe that egg off my face…
