Our college football model is live, meaning that from now through the end of the season, you can check in on it to see what our computer thinks about what’s going to happen the rest of the way. It will be updated after every day of games.
The model is very much still in development. We’re working on increasing the number of simulations (so its probabilities don’t wobble so much day to day). We’re working on adding all relevant tiebreakers within conferences (ties are currently broken at random). We’re working on adding the FCS (in case NIT Stu gets Weber State fans to click on this again). Within the next few weeks, we hope to add detailed likelihoods of teams playing in specific bowls. If there are other things you’d like to see, let us know, because we want this to be something you find worth your while.
The model runs on four publicly available rating systems, using those systems’ ratings of teams to simulate the remaining games of the season. The systems are weighted differently, according to their predictive power. Right now, those are Bill Connelly’s SP+, ESPN’s FPI, the Massey Ratings, and the Sagarin Ratings (editor’s note: in earlier weeks, our model did not include SP+ due to uncertainty over whether the ratings would remain publicly available).
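For the curious, here’s a rough sketch of what that game-by-game step might look like. The weights, the home-field bump, and the logistic conversion from rating gap to win probability below are placeholder numbers we’ve picked for illustration, not the model’s actual parameters.

```python
import math
import random

# Placeholder weights for the four systems; the real weights are tuned to
# each system's predictive power and are not reproduced here.
WEIGHTS = {"sp_plus": 0.30, "fpi": 0.30, "massey": 0.20, "sagarin": 0.20}

def blended_rating(team_ratings):
    """Combine one team's ratings from the four systems into a single number."""
    return sum(WEIGHTS[system] * rating for system, rating in team_ratings.items())

def win_probability(home_ratings, away_ratings, home_edge=2.5, scale=7.0):
    """Turn the blended rating gap (plus a home-field bump) into a win probability
    with a logistic curve. Both constants are illustrative, not fitted values."""
    gap = blended_rating(home_ratings) - blended_rating(away_ratings) + home_edge
    return 1.0 / (1.0 + math.exp(-gap / scale))

def simulate_game(home, away, ratings, rng=random):
    """Return the winner of a single simulated game."""
    p_home = win_probability(ratings[home], ratings[away])
    return home if rng.random() < p_home else away
```

The full model repeats that game-level step across every remaining game on the schedule, over many simulated seasons, which is why we’re working on increasing the simulation count.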
With those results in hand, we simulate the College Football Playoff committee’s selection process, using the best formula we could derive from the last five years of rankings. It incorporates the following metrics (a rough sketch of how they might be combined appears after the descriptions below):
Wins and Losses
Self-explanatory. Winning is good. Losing is bad.
Adjusted Point Differential
This is not an original idea. The point of using it is to measure a team’s performance relative to its opponents’ average results, as a gauge of how that team might stack up in the court of public opinion. While the committee isn’t supposed to incentivize margin of victory, margin of victory matters a great deal in the college football landscape, and this captures that to an extent.
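As a rough illustration (not our exact formula), one way to compute an adjusted point differential is to compare each game’s margin to what that opponent typically allows, then average the differences. The data shapes here are our own assumptions for the example.

```python
def adjusted_point_differential(games, opponent_avg_margin):
    """Compare the margin a team posted in each game to the average margin that
    opponent concedes to everyone else, then average the differences.
    `games` is a list of (opponent, points_for, points_against) tuples;
    `opponent_avg_margin` maps each opponent to the average margin of its other games.
    Both inputs and the simple averaging are illustrative choices."""
    diffs = []
    for opponent, points_for, points_against in games:
        margin = points_for - points_against
        diffs.append(margin - opponent_avg_margin[opponent])
    return sum(diffs) / len(diffs) if diffs else 0.0
```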
Two Strength of Schedule Metrics
One of our metrics relies on Adjusted Point Differential and is weighted slightly more heavily than the other, which looks at the winning percentage of a team’s opponents, those opponents’ opponents, and those opponents’ opponents’ opponents. Both metrics adjust for home, road, and neutral-site games.
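Here’s a minimal sketch of the second, record-based metric. The 0.5/0.3/0.2 weights are placeholders we chose for the example, and the home/road/neutral adjustment described above is omitted for brevity.

```python
def record_sos(team, schedule, win_pct, w1=0.5, w2=0.3, w3=0.2):
    """Strength of schedule from opponents' winning percentage, their opponents',
    and their opponents' opponents'. `schedule` maps a team to the list of teams
    it has played; `win_pct` maps a team to its winning percentage. The weights
    are placeholders, and the venue adjustment is left out here."""
    def avg(teams):
        return sum(win_pct[t] for t in teams) / len(teams) if teams else 0.0

    opponents = schedule[team]
    opp_opponents = [t for o in opponents for t in schedule[o] if t != team]
    opp_opp_opponents = [t for o in opp_opponents for t in schedule[o]]
    return w1 * avg(opponents) + w2 * avg(opp_opponents) + w3 * avg(opp_opp_opponents)
```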
Conference Championships
Again, self-explanatory. One of the listed selection criteria for the committee.
Best and Second-Best Win, Worst Loss
We found in our research that the highest end of a team’s schedule and its worst performance (or performances) carry more weight with the committee than the rest. We’ve seen teams rewarded for a solid top end of the schedule (Oregon’s victories over Michigan State and Arizona in 2014 likely helped vault them over Florida State). We’ve seen teams punished for a particularly bad loss (Ohio State’s loss to Purdue last year seemed to hold them back more than Oklahoma’s avenged loss to Texas did).
We didn’t find a perfect way to incorporate these results. Sometimes the best three wins might matter. Sometimes only the best win might. There may come a year when the committee is forced to split hairs between three-loss teams. But this method gets us closer than we’d be without it.
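One way to pull these values out of a set of results, sketched here under the assumption that opponent quality is a single number (our actual quality measure may differ):

```python
def schedule_extremes(results, quality):
    """Pick out the best win, second-best win, and worst loss from a team's results.
    `results` is a list of (opponent, won) pairs; `quality` maps each opponent to a
    quality score where higher is better (an assumption for this sketch).
    Returns None for slots a team hasn't filled yet (too few wins, or no losses)."""
    wins = sorted((quality[o] for o, won in results if won), reverse=True)
    losses = sorted(quality[o] for o, won in results if not won)
    best_win = wins[0] if wins else None
    second_best_win = wins[1] if len(wins) > 1 else None
    worst_loss = losses[0] if losses else None
    return best_win, second_best_win, worst_loss
```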
Power Five Conference Membership
There’s evidence lower down in the rankings that Group of Five teams receive less credit than their other numbers might suggest. It’s unclear whether that’s because their schedules receive more scrutiny, because their wins aren’t valued as highly, or for some other reason, but when we ran the numbers we consistently saw Group of Five teams over-ranked by our model until we added this variable.
Loss Forgiveness, Loss Punishment
Not all losses are created equal. When Syracuse beat Clemson in 2017, with Kelly Bryant missing the second half due to a concussion, the Tigers felt no ill effects in the rankings. When Ohio State was pummeled by Iowa that same year, the loss was treated more severely than other losses of comparable margins to comparable opponents.
The way we incorporate this metric might be considered “cheating” by some. Our model is not built to predict when the committee will forgive a loss, or when it will punish one more severely than normal. Instead, this function kicks in once we have data indicating how the committee, or public perception, is already treating certain losses. (Public perception has been seen to influence the committee, as when it vaulted UCF from 11th to 9th under media pressure last year even though the only new UCF data that week was a relatively underwhelming victory over Navy.) We’ll add the adjustment if we notice from committee rankings, or from the AP Poll earlier in the season, that a particular loss is being treated differently from how our model expects.

We could approach this differently: by, say, building our model around each set of rankings, as FiveThirtyEight does. We chose this approach because we think college football is trending away from the “horse race” way of approaching rankings and toward something closer to college basketball’s more zoomed-out view of résumés. We might be wrong about that, but the end product will likely be similar.
We know some might view this approach as “cheating,” though, and we understand where they’re coming from. We’d like to enter zero manual adjustments into this model, and this is a manual adjustment. In the end, we think it’s worthwhile, because we value precision and accuracy more highly than the purity of a model. We want to give you the best reflection possible of what the playoff picture looks like. If that means incorporating a manual adjustment, so be it.
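Put together, the committee-ranking step amounts to scoring every résumé on the inputs above and sorting. Here’s the promised sketch of how such a formula might combine them; every weight below is invented for illustration, and the fitted values from our five years of committee data aren’t reproduced here.

```python
def committee_score(team_metrics, weights=None):
    """Score a team's résumé as a weighted sum of the metrics described above,
    plus any manual loss forgiveness/punishment adjustment. All weights are
    placeholders for illustration, not the model's fitted values."""
    weights = weights or {
        "wins": 1.0, "losses": -1.2, "adj_point_diff": 0.05,
        "sos_adjusted": 0.8, "sos_record": 0.6, "conf_champ": 0.5,
        "best_win": 0.4, "second_best_win": 0.2, "worst_loss": 0.3,
        "power_five": 0.5, "loss_adjustment": 1.0,
    }
    return sum(weights[k] * team_metrics.get(k, 0.0) for k in weights)

def committee_rankings(all_team_metrics):
    """Sort teams by committee score, highest first. `all_team_metrics` maps
    each team name to its metrics dictionary."""
    return sorted(all_team_metrics,
                  key=lambda t: committee_score(all_team_metrics[t]),
                  reverse=True)
```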
***
As always, reach out if you have questions about the model, or ideas, or just want to talk shop. We’re open to feedback, and to be perfectly honest, our readership at this point is small enough that we can respond to most questions and comments.