Using unpaired t tests of past regular season statistical rankings to predict nothing about the Super Bowl
I am especially excited about this year’s Super Bowl.
The season has been all about the Seahawks and Broncos and it’s nice to see players rewarded for regular season success. But how do you even formulate an argument to pick the winner? Both teams finished the regular season with 13-3 records while performing similarly against playoff teams. Denver’s offense broke the single season record for points scored while Seattle’s defense is undeniably the best in the NFL. Something has to give, right?
Here I’ve run some simple analyses on 7 regular season statistics that I hypothesize are best associated with team success to determine correlates of past success in the big game. I will use these data to predict the Super Bowl winner.
I went back and looked at the past 15 Super Bowls dating back to 1999. This date was chosen as it was the year that the Minnesota Vikings broke the single season record for points scored, ushering in the era of some of the most prolific offenses in NFL history. While Minnesota would lose in the NFC championship game that year, high-powered offenses including those of the St. Louis Rams, Indianapolis Colts, New England Patriots, New Orleans Saints, Green Bay Packers and this year’s Denver Broncos would dominate the landscape of the league for the next 15 years.
For each participant in each Super Bowl between 1999 and 2013, I have assessed the team’s overall ranking in each of seven cherry-picked regular season statistical areas that I presumed had the best chance of correlating with Super Bowl success. NFL rankings instead of raw numbers were used to normalize the data. A parametric analysis was performed using an unpaired t-test to generate p-values for significance (p<0.05 is generally considered statistically significant).
GraphPad Prism did most of the work for me. I am not a trained statistician by any stretch of the imagination, so if my inferences are stupid, kindly tell me where I went wrong. Here you can view each plot with its associated p-value. Here is a sloppily produced PDF of the raw data that I used, obtained from NFL.com.
edit- I posted incorrect raw data of the composite rankings in the second PDF. Here are the corrected values.
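For anyone who wants to check this kind of comparison without Prism, here is a minimal sketch of a two-tailed unpaired t-test in plain Python. It assumes the classic Student form with equal variances, which may or may not match the Prism settings I used. The function names and the two lists of rankings are made up for illustration and are not the data from the PDF.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, integrating the density with Simpson's rule."""
    x = abs(t)
    if x == 0:
        return 1.0
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


def unpaired_t_test(a, b):
    """Student's unpaired t-test, equal variances assumed. Returns (t, df, p)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(a) - mean(b)) / se
    return t, df, two_tailed_p(t, df)


# Hypothetical league rankings (1 = best) for 15 winners and 15 losers.
# Placeholder values only, NOT the rankings used in this post.
winner_ranks = [3, 12, 7, 1, 20, 5, 9, 14, 2, 11, 6, 18, 4, 8, 10]
loser_ranks = [2, 10, 15, 4, 6, 1, 13, 9, 3, 17, 5, 8, 12, 7, 11]

t, df, p = unpaired_t_test(winner_ranks, loser_ranks)
print(f"t = {t:.3f}, df = {df}, p = {p:.4f}")
```

With 15 winners and 15 losers the test has 28 degrees of freedom, so a p-value under 0.05 requires |t| of roughly 2.05 or more.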
1. Quarterback rating
Justification: The quarterback is the most important position in all of sports. Aside from the center, he is the only player to touch the ball on virtually every offensive play. The ability to make favorable decisions quickly can be the difference between a beastly offense and a lousy one. Quarterback rating takes in multiple parameters, including completion percentage, yards per attempt, touchdowns per attempt and interceptions per attempt, and uses them to gauge a quarterback’s success in all phases of his throwing game (running is not included). While many people find the formula confusing (likely because they haven’t bothered to learn it), you will see that most established quarterbacks remain at the top of the list, year after year.
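For those who haven’t bothered to learn it, the formula is short enough to sketch in a few lines. This follows the standard NFL passer rating calculation as it is commonly published; the function and parameter names are my own, and the stat line at the bottom is a made-up example rather than any real quarterback’s season.

```python
def passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """NFL passer rating from raw passing totals (scale runs 0 to 158.3)."""

    def clamp(x):
        # Each of the four components is capped between 0 and 2.375.
        return max(0.0, min(x, 2.375))

    a = clamp((completions / attempts - 0.3) * 5)         # completion percentage
    b = clamp((yards / attempts - 3) * 0.25)              # yards per attempt
    c = clamp((touchdowns / attempts) * 20)               # touchdowns per attempt
    d = clamp(2.375 - (interceptions / attempts) * 25)    # interceptions per attempt
    return (a + b + c + d) / 6 * 100


# Hypothetical stat line, for illustration only.
print(round(passer_rating(attempts=500, completions=320, yards=4000,
                          touchdowns=30, interceptions=10), 1))
```

Because each component is capped, a quarterback cannot make up for a pile of interceptions with an absurd completion percentage, which is part of why the ratings of established starters stay fairly stable.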
Results: No statistically significant difference was found between the regular season passer ratings of Super Bowl winning quarterbacks and Super Bowl runners-up (p=0.6205). As a matter of fact, Super Bowl losing quarterbacks had a slightly higher mean passer rating than winners.
2. Defensive yards allowed per game (YPG)
Justification: If your opponent cannot move the ball, they are much less likely to score.
Results: No statistically significant difference was found between the regular season defensive YPG of SB winners and losers (p=0.8839).
3. Defensive points allowed per game
Justification: Teams can’t beat you if they can’t score. This metric takes into consideration the “bend but not break” defense that may occasionally give up big plays but can shut offenses down in the red zone.
Results: Again, no statistically significant difference was found (p=0.8373).
4. Yards per carry
Justification: Although the majority of a team’s yardage comes from passing the ball, the ability to run, especially with a lead, can be essential in securing a victory. A team’s yards per carry (YPC) best represents how successfully it runs the ball.
Results: Sadly, there is once again no statistically significant difference between winners and losers when it comes to regular season rushing efficiency (p=0.694).
5. Turnover differential
Justification: Teams that have a high ratio of takeaways to turnovers often win the field position battle and need to do less offensively to score. Turnovers can derail even the most talented offenses.
Results: No significant difference, with a p-value of 0.6918.
6. Sacks
Justification: I once heard or read a statistic that teams only score on 7% of drives in which the quarterback has been sacked at least once. If you pay attention, you will notice that sacks cause huge momentum shifts and often drastically change field position.
Results: A calculated p-value of 0.8618 indicates no detectable relationship between a team’s regular season sack total and winning or losing the Super Bowl.
7. Field goal percentage
Justification: In a league that strives for parity with a salary cap and revenue sharing, teams are so evenly matched that games frequently come down to the final play, which is often a field goal attempt. Teams with successful kickers are more likely to win close games.
Results: At p=0.1384, this parameter falls short of the typically accepted cutoff for statistical significance of p<0.05. It is still by far the lowest p-value of any parameter tested, so regular season field goal percentage is the only one of the seven that comes anywhere near correlating with Super Bowl success.
Every Super Bowl is incredibly difficult to pick, but this one is especially challenging. Vegas currently has Denver at -3, which is essentially an admission that it’s a toss-up. Vegas always takes the better quarterback in these situations, even though the data suggest that this may prove futile for the Super Bowl.
The Broncos and Seahawks ranked 1 and 2 respectively in field goal percentage, the only category that came close to being statistically significant. Matt Prater and Steven Hauschka are both very good kickers. Seattle’s composite score, obtained by averaging its rankings across the 7 assessed statistics, was an impressive 4.6 while Denver’s was 12.6, indicating Seattle’s advantage in overall team balance. The comparison between the composite scores of Super Bowl winners and losers yielded a p-value of 0.5417.
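The composite score is nothing more than the mean of a team’s seven rankings, so a lower number is better. A minimal sketch, with a hypothetical helper name and placeholder rankings that belong to no real team:

```python
from statistics import mean


def composite_score(ranks):
    """Average a team's league rankings across categories (lower is better)."""
    return mean(ranks.values())


# Placeholder rankings for an imaginary team, NOT Seattle's or Denver's numbers.
example_team = {
    "quarterback rating": 5,
    "defensive yards per game": 3,
    "defensive points per game": 2,
    "yards per carry": 10,
    "turnover differential": 4,
    "sacks": 8,
    "field goal percentage": 6,
}

print(round(composite_score(example_team), 1))
```

Every category carries equal weight here, so one terrible ranking can drag down an otherwise excellent team.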
Despite the lack of any real statistically significant findings, I will still offer a prediction.
Seattle (+3) swarms Peyton and his receivers on defense and takes this one, 24-21.