Advanced Stats, 2018 Fantasy Football

I rarely write or ‘show my work’ on football research as it is not in my best interest to reveal anything about how I come up with my fantasy projections. But there has been a lot of discussion in the fantasy community of late on “Do Defenses Matter?” and it seemed like an opportunity to contribute to the discussion.

There are a number of ways Defenses COULD matter for projecting offense. I have tested several and incorporated the learnings into the projections. Without divulging anything, here is an overview of hypotheses for how an opposing defense could impact an offense:

Rush/pass distribution (i.e., Do offenses adjust their play mix when a Defense is disparate in their ability to defend the rush/pass?)
Total plays (i.e., Do offenses run more/fewer plays based on the quality of the Defense?)
Target distribution (i.e., Do offenses adjust their target distribution based on the strengths/weaknesses of the defense?)
Completion and YPA Efficiency (i.e., Do offenses complete more/fewer passes and for more/less YPA depending on the quality of the defense?)
TDs (i.e., Do offenses score more TDs or distribute those TDs differently depending on the defense’s strengths/weaknesses?)

For this analysis, I am just going to focus on Target Distribution. I have heard defenses called ‘funnels’ because they funnel targets to certain positions, either by design or because of talent differences across position groups. I see a lot of content produced about how many fantasy points above average a defense gives up per position and about WR/CB matchups. Between all that content AND the reactions to my Twitter pal Josh Hermsmeyer’s (@friscojosh) tweets re: defenses don’t matter when projecting fantasy offensive production, I assume the majority of people believe that defenses DO matter for target distribution. (Note: WR/CB matchup aficionados also would suggest defenses affect intra-position target distribution, e.g., a shutdown corner will mean WR1 gets less than his share and WR2/WR3 will prosper. This test does not aim to prove/disprove that.)

Before going into the test results, some background on my general projecting philosophy:

Since I project most/all fantasy relevant stats, my research focuses on each component of fantasy production vs an end result like Fantasy points. This includes both stats/tendencies at team level and player level.
Projecting any sport (MLB and NFL in particular IMO) at the game level is humbling because there is so much variance. You are just doing your best to be the one-eyed man in the land of the blind.
The threshold for including an additional variable/adjustment in the projections can be arbitrary. For me, I am more willing to incorporate minor adjustments if they have massive sample behind them. So I would have a lower threshold for, say, the impact of being home vs road on a stat (where I can go back 15+ years) versus a more complex adjustment based on target distribution (targets don’t go as far back in my data set, target distribution behavior changes over time, etc.)
I am more about ‘parsimony’ than ‘stuffing in every variable so I can market it and/or feel good about myself’. False dichotomy perhaps, but that in itself gives you a window into my thinking. The more I stuff in there, the longer it takes to process and the more that can fuck up.
I first test to find a signal. If I find a signal, I then dig in to determine how best to apply it (i.e., is an 8 game rolling average > 4 game rolling average? What happens if I weight the value of each game? etc.)
I am game for anything that will improve the projections. One underdiscussed element of fantasy projection is that some of the most relevant calculations (like playing time) are near impossible to backtest because their factors are so temporal. So if something exists for 20% of cases and not in 80% of cases AND that can be proven, I’m not opposed to incorporating exception cases. But I’m warier of them.
My focus is not the ‘truth’, narratives, or gauging the impact after the fact. All I care about is what I know before the start of the game and whether it improves the accuracy of the projection.

Here are the test details:

	Explanation
Objective	Test whether the Defense helps predict target distribution by position (RB/WR/TE) for the Offense
Timeframe For Test	Week 1 2016 – Week 5 2018 (only Weeks 5-17 used for correlation since employing rolling 4 game average). Ended up being an eerily round 800 games in the sample.
Metric Definitions	Target Share (represented as a number vs percentage) = # of Targets Thrown to Position Group / Total Pass Attempts * 100. Example: 15 targets to RB / 50 passes attempted * 100 = RB has a 30 target share. +/- Target Share Difference vs League Average = Target Share For a Position – League Average Target Share for a Position. Example: If a team has a 15 target share for RB and the league average is 20, this is a -5.
Stat I am Trying to Project	An offense’s actual target share by position for a game adjusted to league average (so 15 target share to RB and league average is 20 is a -5)
Stats/Variables I am Using to Predict this outcome	All of the below are adjusted to league average. Offense: Rolling 4 game average of team’s target share distribution (note: 3 games if bye week within the 4 week period). Defense: Rolling 4 game average of opponent’s target share distribution against the defense (note: 3 games if bye week within the 4 week period). Defense (Opponent Adjusted): Rolling 4 game average for Defense + Offensive Opponent’s Average target share for all their games THAT SEASON except when they faced the defense – League Average Share.
How Do I Measure The Predictiveness of Each Variable?	A standard Pearson correlation test (r)
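
The metric definitions and the correlation test in the table can be sketched in a few lines of Python. This is a minimal sketch, not the spreadsheet’s actual formulas: the function names are my own, the eight-game series is made up, and the rolling average here simply uses the previous four games played rather than applying the bye-week rule.

```python
from math import sqrt


def target_share(position_targets, pass_attempts):
    """Targets to a position group per 100 pass attempts (30 means 30%)."""
    return 100 * position_targets / pass_attempts


def vs_league_average(share, league_average_share):
    """The +/- metric: a team's share minus the league average share."""
    return share - league_average_share


def prior_rolling_mean(values, window=4):
    """Mean of the `window` games BEFORE each game; None until enough history."""
    return [
        sum(values[i - window:i]) / window if i >= window else None
        for i in range(len(values))
    ]


def pearson_r(xs, ys):
    """Standard Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Examples from the table: 15 RB targets on 50 attempts, league average of 20.
rb_share = target_share(15, 50)             # 30.0
rb_plus_minus = vs_league_average(15, 20)   # -5

# Made-up +/- values for one offense over eight games, in game order.
actual = [-5.0, 2.0, 1.0, -3.0, 4.0, 0.5, -1.5, 3.0]
prior = prior_rolling_mean(actual)

# Pair each game's actual +/- with the rolling average that was known before it.
pairs = [(p, a) for p, a in zip(prior, actual) if p is not None]
r = pearson_r([p for p, _ in pairs], [a for _, a in pairs])
print(f"r between prior 4-game average and actual: {r:.3f}")
```

The same pairing works for the Defense variable: swap in the defense’s rolling average of what it allowed as the predictor and keep the offense’s actual +/- as the outcome.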

Findings (Test results can be found here. It is set to View only but you can create a copy to fiddle with it and check my math)

1) As one would expect, there are positive correlations b/w an Offense’s Actual target distribution to RB/WR/TE for game x+4 and their average for the 4 games prior (r=.32-.43). TE is the most ‘predictive’ which makes sense since that is the position of the three whose usage in the passing game is most dictated by talent + scheme.

2) The correlations between the Offense’s Actual target distribution and the Defense’s rolling 4 game average were virtually zero for RB (-.008), with minor positive correlations for WR (0.05) and TE (0.07).

3) Running a regression using both Offense and Defense, the lift provided by Defense is inconsequential (for TE, it moved the correlation from .423 to .429). To illustrate how inconsequential the impact of defense is based on the regression, I will use 2018 Week 5 KC vs Jacksonville. The rolling 4 game average for KC TE adjusted target share was +6 (Kelce!) and Jacksonville was -2.1 (so about 2 percentage points below league average). The regression formula just using the offensive information would be a +3.74. Incorporating the defense it would be a +3.52. The average team throws the ball around 38 times a game. Let’s round to 40. League average is 21.1% TE targets so that’s 8.44 TE targets/game. A +3.74 would mean 24.84% or 9.94 targets. A +3.52 would mean 24.62% or 9.85 targets, which is a difference of about 0.09 targets. The most extreme case in the sample was one week when the 2017 Denver defense was a +16.6 on TE targets (-23.9 on WR!). That would net to 2.3 percentage points, which would equate to 0.92 targets. So the most extreme example in the past 2+ years impacts the projection by less than 1 additional target (and, note, it doesn’t improve the accuracy of the projection in a significant way).
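
The conversion from target-share points to targets per game in finding 3 can be reproduced with a short Python sketch. It only redoes the arithmetic, carrying unrounded values through; the regression coefficients themselves are not given in the post, so the +3.74 and +3.52 outputs are taken as inputs. The function and constant names are my own.

```python
PASS_ATTEMPTS = 40          # the post rounds ~38 attempts per game up to 40
LEAGUE_AVG_TE_SHARE = 21.1  # league average TE target share quoted in the post


def share_points_to_targets(share_points, pass_attempts=PASS_ATTEMPTS):
    """Convert target-share points (percentage points) into targets per game."""
    return share_points / 100 * pass_attempts


def projected_targets(adjustment, league_avg=LEAGUE_AVG_TE_SHARE,
                      pass_attempts=PASS_ATTEMPTS):
    """Projected targets for a position given its +/- vs league average."""
    return share_points_to_targets(league_avg + adjustment, pass_attempts)


baseline = projected_targets(0.0)        # league-average TE targets per game
offense_only = projected_targets(3.74)   # regression output using offense only
with_defense = projected_targets(3.52)   # regression output adding the defense
defense_effect = offense_only - with_defense

# The 2.3-point swing quoted for the most extreme defense in the sample.
extreme_effect = share_points_to_targets(2.3)

print(f"baseline:        {baseline:.2f} TE targets")
print(f"offense only:    {offense_only:.2f} TE targets")
print(f"with defense:    {with_defense:.2f} TE targets")
print(f"defense effect:  {defense_effect:.2f} targets")
print(f"extreme defense: {extreme_effect:.2f} targets")
```

Because the conversion is linear, every percentage point of target share is worth 0.4 targets at 40 attempts, which is why even a large defensive +/- moves the projection by less than a target.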

4) We know that there is some noise in the defensive 4 game rolling average because, say, a defense faced Carolina (CMC), Saints (Kamara), Giants (Barkley), and Chargers (Gordon/Ekeler). Clearly you would want to adjust the RB target share down to reflect those opponents’ typical target distribution. To adjust for the offense’s bias/strengths, I adjusted the Defensive average to reflect the average of their opponents’ target distributions in all other games that year. I used that longer time period so I could still use Weeks 1-4 when I don’t have enough prior game data to determine team target distribution tendencies. Oddly, the adjusted target metric for Defense had (very mildly) lower correlations than the unadjusted target metrics indicating either a) It’s all noise or b) I fucked up the calc. I’m sharing the spreadsheet so people can test the latter.

5) You can view the spreadsheet and/or create a copy to test ‘exception’ cases, but I am not seeing any. Here are two examples:

The top 20 most RB Target friendly defenses (out of 800 games) averaged a +10.6 on RB targets, which is roughly 50% above the league average of about 20.5 target share for RBs. The offenses in those games came in averaging a 0.0, so a random distribution of offenses. The actual average for those 20 games was -2.8, as in the teams targeted RBs LESS than their 4 game average despite the defenses being RB target sieves over the past 4.
The top 20 least WR target friendly defenses in the sample averaged a -14.2, so their average WR target share allowed was 43.1 vs the league average of 57.3. The offenses in those games came in averaging -0.2, so very close to random. The actuals were -2.7, which is about 1 fewer WR target/game (and they average about 22-23).
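
For anyone who wants to test other ‘exception’ cases on a copy of the data, the check used in the two examples above can be sketched in Python as follows. The field names (`defense_prior`, `offense_prior`, `actual`) and the six sample games are hypothetical; the real spreadsheet’s column names will differ, and the post uses the top 20 of 800 games rather than the top 2 of 6.

```python
from statistics import mean


def summarize_extremes(games, n=20, highest=True):
    """Average the n games with the most extreme defensive rolling +/-.

    Each game is a dict holding the defense's rolling +/- coming in, the
    offense's rolling +/- coming in, and the offense's actual +/- that game.
    """
    ranked = sorted(games, key=lambda g: g["defense_prior"], reverse=highest)
    extreme = ranked[:n]
    return {
        "defense_prior": mean(g["defense_prior"] for g in extreme),
        "offense_prior": mean(g["offense_prior"] for g in extreme),
        "actual": mean(g["actual"] for g in extreme),
    }


# Made-up games, all values in target-share points vs league average.
games = [
    {"defense_prior": 12.0, "offense_prior": 1.0, "actual": -2.0},
    {"defense_prior": 9.0, "offense_prior": -1.0, "actual": -4.0},
    {"defense_prior": 0.5, "offense_prior": 3.0, "actual": 2.0},
    {"defense_prior": -3.0, "offense_prior": 0.0, "actual": 1.5},
    {"defense_prior": -8.0, "offense_prior": 2.0, "actual": 1.0},
    {"defense_prior": -11.0, "offense_prior": -2.0, "actual": -3.0},
]

most_friendly = summarize_extremes(games, n=2)
least_friendly = summarize_extremes(games, n=2, highest=False)
print(most_friendly)
print(least_friendly)
```

If defenses really funneled targets, the `actual` average for the most friendly group would sit well above the `offense_prior` average, and the reverse for the least friendly group.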

Conclusion

I see no signals that defenses predictably drive the target distribution for the offense.

Please feel free to comment here or on Twitter (@rudygamble) if you find something different in the data. I will amend this post if anyone finds something and give credit.