Our college football model is “fully” live again, which mostly means we’ve started keeping most of these pages updated:
- College Football Playoff Bracketology
- College football probabilities (make playoff, win conference, etc.)
- Movelor game predictions
- FCS Bracketology
- Full 2025–26 archives (coming soon)
We’re repurposing a lot of last year’s post because not a ton has changed within the model. We made one update to Movelor and one to FPA. The first should make Movelor about a tenth of a point more accurate in the long run and the second should save us from some headaches with the committee’s “sticky” rankings. More on those later.
Section 1 – Movelor
The Name
Movelor’s name is close to an acronym, standing for Margin of victory-based elo with recruiting, and that name is to some extent self-explanatory. The only live inputs into the continuously updating formula are 1) the margin of victory or defeat in each game involving a Division I football team, 2) each team’s offseason recruiting score on the 247Sports Composite, and 3) each team’s standing in the preseason AP Poll. Overall? Movelor is a very simple system. All it cares about is which teams win or lose, where the game is played, the margin by which the game is decided, how well each team is seen to recruit, and what the press thinks of the best teams heading into the season. Despite or because of this simplicity, it performs fairly well. More on that below.
The Scale
The numerical ratings within Movelor have no minimum or maximum. Their average is roughly 0.0, meaning the average Division I football team measures out at 0.0 on this scale. The ratings are equivalent to points per game, meaning a team with a Movelor rating of 28.0 should be expected to beat an average Division I football team by 28 points on a neutral field. Please note: That’s Division I. Not FBS. Movelor covers both the FBS and the FCS.
The Movelo Parts
The elo system is the core of Movelor. Its name and much of its design come from a rating system first devised in chess. (It’s called Elo over in that world, named after its inventor, Arpad Elo.) The basic idea of elo systems is that each competitor has a numerical rating and that each rating updates with every competition the competitor completes. If the result of the competition is surprising—if the competitor wins over an opponent with a much higher rating than their own or loses to an opponent with a much lower rating than their own—the rating will swing by a broader margin than if the result is unsurprising—if the competitor wins as a significant favorite or loses as a heavy underdog. Wins improve elo. Losses worsen it.
We add on to this with a margin-of-victory component, one which operates in a logarithmic manner, meaning the difference between a 14-point victory and a 7-point victory is larger than the difference between a 28-point victory and a 21-point victory in Movelor’s eyes. The way Movelor handles margin of victory is to look at its own expectation for the margin of a given game and then, just as with the elo piece, assign a larger change in rating to teams who surprise by a lot and a smaller change to teams who surprise little. In short: If Alabama is a 14-point favorite in Movelor’s eyes and wins by 17, Alabama and its opponent will not see their ratings change very much. If Texas is a 3-point favorite in Movelor’s eyes and wins by 42, Texas and its opponent will see their ratings change dramatically.
There are better ways to measure teams’ relative performance in a game than margin of victory. Ultimately, though, we find this approach does a surprisingly good job of keeping up with the industry. So, while we’d like to refine Movelor’s game-by-game adjustments one day, that’s not a particularly urgent priority for our business.
Home field matters in college football, and we account for location of games in the initial expectations against which the margin is compared. Our best estimate is that home-field advantage is worth three points.
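If you want to see the moving parts, here’s a minimal sketch of what an update like this could look like. The 3-point home-field figure comes from above; the K-factor, the noiseless spread formula, and the exact shape of the logarithmic dampening are illustrative assumptions, not Movelor’s actual parameters.

```python
import math

HOME_FIELD = 3.0  # points of home-field advantage, per our estimate above

def expected_margin(home_rating: float, away_rating: float, neutral: bool = False) -> float:
    """Movelor-style spread: rating gap plus home field (ratings are points vs. an average D1 team)."""
    return home_rating - away_rating + (0.0 if neutral else HOME_FIELD)

def update_ratings(home_rating: float, away_rating: float, actual_margin: float,
                   neutral: bool = False, k: float = 0.15) -> tuple[float, float]:
    """Shift both ratings toward the result, scaled by how surprising it was.

    The logarithmic dampening means going from a 7-point surprise to a 14-point
    surprise moves the ratings more than going from a 21-point surprise to a
    28-point one. Both k and the dampening curve are placeholders.
    """
    surprise = actual_margin - expected_margin(home_rating, away_rating, neutral)
    shift = math.copysign(k * 7.0 * math.log1p(abs(surprise)), surprise)
    return home_rating + shift, away_rating - shift

# A 14-point home favorite that wins by 17 moves only a little; a 3-point
# favorite that wins by 42 moves a lot more.
```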
The one adjustment we’ve made this year is that early in the season, we’re allowing Movelor to update ratings faster. We’ve had problems in the past where we know Movelor is too high or too low on a team, but that team is so abnormal (there are more than 250 playing Division I football, you get some outliers) that conventional measurement practices fail. We’ve also had problems in the past where Movelor kind of stinks in September before starting to thrive down the stretch. To address both of those, we’ve designed Movelor to adjust 65% more dramatically to Week 0 and Week 1 results, 52% more dramatically to Week 2, 39% more dramatically to Week 3, 26% more dramatically to Week 4, and 13% more dramatically to Week 5. This will create some wilder swings in the early going, but it should make the model more accurate over the full season. It’s reflected in our simulations as well, which probably better captures the uncertainty around conference championships and the playoff fringe early in the year.
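In code terms, the change amounts to a schedule of multipliers applied to whatever rating adjustment a game would otherwise produce. The percentages are the ones listed above; the rest of this sketch is just plumbing.

```python
# Extra early-season aggressiveness: Week 0/1 updates are 65% larger than normal,
# fading in equal steps to no boost from Week 6 onward.
EARLY_SEASON_BOOST = {0: 1.65, 1: 1.65, 2: 1.52, 3: 1.39, 4: 1.26, 5: 1.13}

def scaled_rating_change(base_change: float, week: int) -> float:
    """Scale a game's normal rating adjustment by the early-season multiplier."""
    return base_change * EARLY_SEASON_BOOST.get(week, 1.0)
```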
The Recruiting Part
Movelor’s second element is its R, recruiting, and it is a small element. Last offseason, only six teams saw their rating change by a point or more based on recruiting.
Movelor looks at recruiting through the lens of a “talent score,” a weighted average of recruiting class scores from the 247Sports Composite over the last five years. It phases these scores in and out, with the class from three years ago twice as important as the classes from two and four years ago and those classes twice as important as the classes from one and five years ago. This is an extremely, extremely basic way to grade how much talent is in a program.
Once Movelor has each team’s talent score for a given season, it compares it to the same team’s talent score from the year before and adjusts the rating accordingly. A team could have had great recruiting, but if it isn’t as good as its previous recruiting was, its Movelor rating will go down. In this way, it normalizes each team’s talent level to itself. If a team has performed well without much talent, Movelor doesn’t doubt them just because they lack talent again this year.
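For the curious, a rough sketch of both steps. The 1-2-4-2-1 weighting follows the description above; the coefficient that turns a change in talent score into rating points is a placeholder, not the value Movelor actually uses.

```python
def talent_score(class_scores: list[float]) -> float:
    """Weighted average of the last five 247Sports Composite class scores.

    class_scores runs from the most recent class to the class five years ago.
    The class from three years ago counts most (weights 1-2-4-2-1).
    """
    weights = [1, 2, 4, 2, 1]
    return sum(w * s for w, s in zip(weights, class_scores)) / sum(weights)

def recruiting_adjustment(talent_now: float, talent_last_year: float,
                          points_per_unit: float = 0.05) -> float:
    """Rating change from recruiting, measured only against the team's own past.

    Better than your own recent history and your rating ticks up; worse and it
    ticks down. points_per_unit is an illustrative placeholder.
    """
    return (talent_now - talent_last_year) * points_per_unit
```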
Again, this is very basic, and it’s grown more outdated since we debuted it thanks to the growth in transfers. We want to go through each roster and break players out by original high school recruiting class to account for aging. This is our preferred route for refining this approach. That, though, turned out to again be too big a project for this specific offseason. So…the AP poll remains a piece of Movelor:
The Preseason AP Poll
We loathe the in-season AP poll around here. As we’ve said too many times to count, it doesn’t define what it’s ranking, instead serving as some nebulous, ever-changing combination of how good a team is and how strong its résumé is, with plenty of noise thrown in surrounding good storylines. The preseason AP poll, though, is pretty good. The reporters who vote on it always know a lot, but in the preseason, they apply that knowledge consistently with one another. They use the same rubric. They rank how good they expect teams to be. They might not often pick the correct national champion (they haven’t since 2017), but the rankings are good, and they have at least a little predictive power.
So, last year we added the preseason AP poll to Movelor. The way we do it is to take every team’s rating after the recruiting adjustment and line them up in order. Then, we find the lowest-rated team ranked in the top 25. Some years, this team is rated 30th by Movelor. Sometimes, this team is rated 73rd. Wherever they fall on Movelor’s curve, we take them and every team above them and assign them an “AP Movelor rating.” This hypothetical rating is how we say Movelor would rate each team if it were fully reliant on the AP poll. In the hypothetical rating, all unranked teams take on that 25th-ranked team’s rating, with the 25 ranked teams following upwards from there in equidistant steps, steps large and small enough that the 1st-ranked team’s hypothetical AP Movelor rating is equal to Movelor’s original preseason favorite’s. (Often, these are the same team.) Once we have these hypothetical ratings, we blend them with Movelor’s original rating, post-recruiting adjustment, giving a little less than twice as much weight to the original rating, which is the ratio we found to yield the best historical results.
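Here’s a sketch of how we’d write that procedure down. The anchoring of AP #1 to Movelor’s original preseason favorite, the shared floor for unranked teams, and the roughly two-to-one blend all come from the description above; the exact 0.64 weight is a stand-in rather than the fitted ratio.

```python
def ap_blend(movelor: dict[str, float], ap_rank: dict[str, int],
             w_original: float = 0.64) -> dict[str, float]:
    """Blend post-recruiting Movelor ratings with a hypothetical AP-based rating.

    movelor: post-recruiting-adjustment rating per team.
    ap_rank: preseason AP rank (1-25), for ranked teams only.
    w_original: weight on the original rating (a little under 2:1; placeholder).
    """
    floor = min(movelor[t] for t in ap_rank)   # lowest-rated team the AP ranks
    ceiling = max(movelor.values())            # Movelor's original preseason favorite
    step = (ceiling - floor) / 24              # 24 equidistant steps for ranks 25 -> 1

    blended = {}
    for team, rating in movelor.items():
        if rating < floor and team not in ap_rank:
            blended[team] = rating             # teams below the floor are untouched
            continue
        hypothetical = ceiling - step * (ap_rank[team] - 1) if team in ap_rank else floor
        blended[team] = w_original * rating + (1 - w_original) * hypothetical
    return blended
```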
Overall, this has two effects: First, it accounts for top-25 teams who experienced significant offseason turnover, like both Michigan and Washington did last year. Second, it offers some mean reversion. For example: When an FCS team blows the doors off of every other FCS team it faces in the FCS Playoffs, their absence from the ensuing season’s preseason AP poll keeps Movelor from rating them a top-ten team entering the next season. This also applies to teams with big bowl game wins, wins which are meaningful but can be misleading.
The approach certainly has its shortcomings. It offers no offseason adjustment to most of the Group of Five and the FCS. But it does improve accuracy.
Movelor’s Strengths and Weaknesses
We already talked a lot about the offseason adjustment piece, so we won’t belabor that here.
The strengths? It’s pretty accurate. Movelor generally has an average error of about 13 points per game. The closing spread in betting markets generally has an average error of about 12 points per game. ESPN’s FPI and SP+ fall in between the two.
One driving factor behind Movelor’s success is probably how it treats the FCS. Not all college football models treat different FCS opponents as different from one another. Many do, and more do than used to, but some still don’t. There is a big difference between South Dakota State and Stetson. A roughly 60-point difference, in fact, at the end of the 2023–24 season.
That FCS inclusion is another of Movelor’s calling cards. FCS football is fun, and we have numbers on it, and not everybody does. This also leads us to greatly appreciate the FCS powers. Others realize when North Dakota State cracks the national top 25 in quality despite operating with more than a third fewer scholarships than their FBS peers, but they don’t pay attention the way Movelor forces us to pay attention. This is a more subjective benefit, but it’s a benefit nonetheless.
Another driving factor behind Movelor’s success is likely that simplicity. It’s easy to overthink things. It’s especially easy to overthink things when trying to catch up to public perception. Movelor does not do this. It’s ham-handed, and in some ways, that’s probably good.
Weaknesses? Beyond the offseason adjustment and those better ways to measure relative performance:
Movelor doesn’t consider tempo, and it doesn’t break out into offensive and defensive components. These could limit its precision—short-field touchdowns after a penalty and a bad kickoff are just as important to Movelor as well-executed drives down the field—and they make it so we can’t forecast totals or exact scores. It might overweight high-octane offenses and underweight strong defensive performances, but it’s not as simple as that: Movelor gave Iowa a lot of credit in 2023 for never letting teams put up big numbers against them.
Movelor also doesn’t look backwards. Once a result is fed into Movelor, Movelor never looks at that game again. So: If Penn State blasts West Virginia one week and West Virginia goes on to be a lot better than Movelor expects, Movelor won’t change its impression of Penn State’s victory based on how its impression of West Virginia has changed. All scores are final, so to speak. Again, this isn’t a terrible weakness—college football teams change in quality as the year goes on, so while some Bayesianism would be preferred, it’s easy to go too far. But we’d like to blend this in, at least a little bit. We wonder if it would help us close our gap on the markets.
Section 2 – College Football Playoff Probabilities
Movelor is solid. It gets the job done. But our playoff probabilities are where this model shines. We first launched them in 2019. They did very, very well. We’ve continued over the years since, with some small breaks in publication due to Covid-induced scheduling uncertainty.
We’ve tweaked the approach over the years, but we’ve never had to tinker with it too dramatically. Before 2023, it had never missed a playoff team.
As with Movelor, there are long-term changes we want to make to this part of the model, specifically in how it reacts to the committee’s rankings each week. But overall, it does a better job than Movelor does, relative to the market.
Changes we did make this year:
FPA, which you’ll read about below, is basically our measure of how surprising the committee’s ranking of a team is, given precedent. We used to give each team’s FPA the same degree of randomness in our model’s forward-looking simulations. We changed the model this year so that FPA swings more wildly following either a loss or a big win.
We also will no longer assume the committee is deviating the minimum amount from precedent. What does this mean? When we input the committee’s rankings into our model, we will no longer tell our model to assume that the committee only barely has the 3rd-ranked team behind the 2nd-ranked team, if that particular comparison is a surprise. Instead, it will break the rankings into segments and assume an equal linear gap between each ranking within a given segment. This is more unrealistic in some senses (gaps in rankings aren’t all equal), but it’s more realistic in that it should lead our model to expect the committee to stick with previous rankings unless something provokes them to reconsider.
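A minimal sketch of the equal-gap idea, with heavy caveats: how the segments get delimited and where each segment’s endpoint values come from is a separate question, so treat those as inputs here.

```python
def segment_values(top_value: float, bottom_value: float, n_teams: int) -> list[float]:
    """Space the teams in one ranking segment with equal linear gaps.

    top_value and bottom_value anchor the segment's best and worst teams;
    everything in between is interpolated evenly. Only the equal-gap spacing
    reflects the change described above; the anchors are assumed inputs.
    """
    if n_teams == 1:
        return [top_value]
    step = (top_value - bottom_value) / (n_teams - 1)
    return [top_value - i * step for i in range(n_teams)]
```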
Finally, we’ve told our model there’s a 25% chance that conference championship losses don’t count at all and a 50% chance that they only count half as much as they would in the regular season. Last year’s committee’s informal decision to mostly ignore those results (but only for the team who lost) seems like it could be reversed, but whatever happens, the treatment will be roughly uniform. If the Big 12 Championship loser doesn’t get punished in one simulation, neither does the Big Ten Championship loser.
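In simulation terms, that works out to one draw per simulated season, with the remaining 25% of simulations (by implication) counting these losses in full. A sketch:

```python
import random

def conference_title_loss_weight(rng: random.Random) -> float:
    """How much conference championship losses count in one simulated season.

    25% of simulations ignore them entirely, 50% count them at half weight,
    and the rest count them in full. The same draw applies to every conference
    championship loser in that simulation, per the note above.
    """
    roll = rng.random()
    if roll < 0.25:
        return 0.0
    if roll < 0.75:
        return 0.5
    return 1.0
```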
How this part of the model works, how strong it is, and what you should know when referencing it:
CFP Formula
Our approach uses a core ranking formula built off of six sets of metrics:
Wins/Losses/Win Percentage
This is fairly self-explanatory. We broke it out into win percentage in addition to wins and losses to adjust for teams without conference championship games (when that was an issue), teams with games canceled by hurricanes, and—of course—the Covid season.
Adjusted Point Differential
This is more complicated.
Adjusted Point Differential (APD) is our metric which approximates the eye test, betting market odds, other advanced ratings, media and coach rankings, and all the other explicit and implicit pieces which impact committee members’ impressions of how good each team under consideration really is.
The way APD works for a given team is to look at each of that team’s opponents’ average margin of victory or defeat (using a flat number for all FCS opponents, because while Movelor knows about the differences between FCS teams, committee members generally don’t) and then compare that average scoring margin to their scoring margin against the given team. In practice: If Indiana has an average scoring differential of –10 points and Ohio State beat them by 20, Ohio State outperformed the average by ten points. Add that up for each of Ohio State’s games, average it, and you have Ohio State’s APD.
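As a sketch, the computation looks something like this. The flat value for FCS opponents is a placeholder (the text above only says it’s flat, not what it is), and the input format is purely for illustration.

```python
FCS_FLAT_DIFFERENTIAL = -20.0  # placeholder flat value for any FCS opponent

def adjusted_point_differential(games: list[tuple[float, float, bool]]) -> float:
    """Compute APD for one team.

    games: one entry per game, as (team's scoring margin in that game,
    opponent's season-average scoring differential, opponent_is_fcs).
    Beating an opponent whose average differential is -10 by 20 points means
    outperforming the typical result against them by 10.
    """
    diffs = []
    for own_margin, opp_avg_diff, opp_is_fcs in games:
        opp_diff = FCS_FLAT_DIFFERENTIAL if opp_is_fcs else opp_avg_diff
        typical_margin_against_them = -opp_diff  # what an average opponent does to them
        diffs.append(own_margin - typical_margin_against_them)
    return sum(diffs) / len(diffs)
```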
Power Four/Group of Five Status
As you may have guessed, APD overestimates the committee’s evaluation of teams outside the Power Four. It knows their schedules are worse, but it doesn’t know how much worse. So, we put a blanket adjustment into our overall CFP ranking formula which deducts from all Group of Five teams, including independents not named Notre Dame. This also accounts for any additional discounting the committee is doing, whether warranted or not. As was the case last year, Washington State and Oregon State are only receiving half a deduction right now. We still don’t know how the committee would treat them in a strong season, so we split the difference. Next year, we’ll probably guess that they’ll be full mid-majors. But that’s getting ahead of ourselves.
Power Four Conference Championship
The committee values conference titles, but it’s always been unclear if it cares about those outside the Power Four. Our formula awards a bonus to teams who win a Power Four conference. This probably won’t matter as much this year, because the top four champions will be locked into byes anyway, but if it does matter—if a Group of Five champion outranks the ACC champion, for example, and another is close—it will matter in a big way.
Three Best Wins/Three Best Losses
While strength of schedule metrics get a lot of airtime, the construction of those formulas is more arbitrary than you’d think. What ultimately seems to really matter to the committee is the quality of each team’s best wins and the quality of each team’s losses, if they have any. We add a component to this which accounts for location and margin of victory or defeat, because blowouts look worse (or better) than narrow losses (or wins).
FPA (Forgiveness/Punishment Adjustment)
Our sixth variable is FPA. FPA is inserted into our CFP formula every week CFP rankings are released. It normalizes our model’s impression of the field to where the committee says the field lies. Outside of reacting to those rankings, we insert FPA into the formula in only three instances:
The Kelly Bryant Rule: If a team loses without its first-string quarterback and that quarterback will be back for the playoff, their worst loss’s impact is halved, in accordance with how the committee treated Clemson in 2017 after the Tigers’ loss to Syracuse.
The Jordan Travis Rule: If a team loses its starting quarterback and that quarterback will not be back for the playoff, and if that team fails to cover Movelor’s spread in the majority of games played after the injury occurs, they receive an FPA deduction equivalent to the average one-ranking gap.
The Nick Saban Rule: Any SEC champion with zero or one losses must be ranked in the top four no matter what. If another one-loss Power Four champion beat them, they must also be ranked in the top four. This is handled via FPA and is not accounted for in our simulations due to the hyperspecificity of the scenario.
These are stupid little exceptions, and we’re aware of that. Basically, what we’re doing with them is saying, “We trust our model, but we’ve seen three specific scenarios where it’s been wrong, and we reserve the right to yank it around in prescribed ways if those scenarios materialize again.” It’s amateur-ish of us, but we think it serves our readers best, and we’ll always be transparent about adjustments we make to our model—the what, the why, and the how. We will continue to work on improving it so we don’t have to make stupid little exceptions.
Like we said above, we want to do a better job with FPA. Our theory when we built this model was that the committee operates more like the basketball committee—one that mostly looks at all the data at one moment in time and builds its rankings from there—than like AP voters, who conduct a horse race each year. This theory had a lot of value and helped us in the model’s early years. Recent committee treatment of 2022 Ohio State and 2023 Georgia, though, has us acknowledging that there’s still some horse race to the process.
Simulations
Our model, using the Movelor rating system, simulates the rest of the season 10,000 times, then tallies up all of the results into probabilities. Each simulation is its own unique season, and Movelor is live within each, adjusting to results as they happen. So: If in one simulation, Temple upsets Oklahoma, Oklahoma is expected to do a lot worse the rest of the season than it is in another simulation where Oklahoma blows out Navy.
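A stripped-down version of that loop, to make the “live within each simulation” point concrete. The outcome noise, the toy in-simulation rating nudge, and the average-wins summary are all simplifications; the real model tallies full CFP rankings and probabilities rather than win totals.

```python
import random
from collections import defaultdict

HOME_FIELD = 3.0   # points of home-field advantage, per Section 1
GAME_NOISE = 17.0  # standard deviation of game outcomes around the spread (an assumption)

def simulate_rest_of_season(schedule, ratings, n_sims=10_000, seed=0):
    """Monte Carlo sketch of the approach above (not the production model).

    schedule: remaining games as (home, away, neutral_site) tuples.
    ratings:  current ratings by team, in points above an average D1 team.
    Every simulation copies the ratings and nudges them after each simulated
    game, so a simulated upset changes how that team fares the rest of the way.
    Returns each team's average number of simulated wins.
    """
    rng = random.Random(seed)
    total_wins = defaultdict(float)
    for _ in range(n_sims):
        sim = dict(ratings)
        for home, away, neutral in schedule:
            spread = sim[home] - sim[away] + (0.0 if neutral else HOME_FIELD)
            margin = rng.gauss(spread, GAME_NOISE)   # simulated final margin
            total_wins[home if margin > 0 else away] += 1
            shift = 0.05 * (margin - spread)         # toy live rating update
            sim[home] += shift
            sim[away] -= shift
    return {team: wins / n_sims for team, wins in total_wins.items()}
```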
Caveats
A few things to be wary of with our system:
At the moment, we have no conference tiebreakers programmed in. Those are purely random. As the season goes on, we will either fix this or adjust the randomness for known tiebreaker scenarios. This is our own fault. Always a problem with all of our models.
There’s still some subjectivity in how we introduce FPA, though we’re working to eliminate that. It’s complicated and it doesn’t make a huge difference how we do it, but we like to be transparent around here.
Our model is blindly loyal to precedent. The committee is not. The model is a little too trusting of the past.
We have not done a full, week-by-week backtest of the model throughout the whole of the playoff era, so our probabilities are uncalibrated. This is another thing we would eventually like to do—make sure that 10% of the things we say are 10% likely end up happening—but we have yet to do it.
APD is very simplistic. It works fine, but it is very simplistic. This would be a nice thing to ultimately improve upon, but do not expect any adjustment to this piece of the puzzle for a while.
Overall, this is one of the best playoff predictors publicly published, with only a few misses in its history and only two spots where we really kick ourselves (the model was going to be wrong about 2022 USC until 2022 USC exposed itself, and the model was wrong about 2024 Miami). Also? If there’s a flaw significant enough that we doubt the model, we will talk about that ad nauseam in our blogposts and give you our best read of the situation. This is a big distinction from other models, where certain media outlets will happily throw out ridiculous numbers and call it statistics. If we see a ridiculous number, we fix the issue behind it before we go back out there, and we tell you what happened. If we don’t fix it, we at least tell you it’s ridiculous and why we think it’s happening and how we plan to fix it going forward. You can count on that from us.
Section 3 – CFP Bracketology
Our wheelhouse! The process here is relatively simple:
First, we gather every team’s average final CFP ranking from our simulations. For teams who average finishing the year unranked (this is most teams), we take their playoff probability.
Second, we line up the nine individual conference favorites by average final CFP ranking, assigning automatic bids to the first five on that list.
Third, we take every team not among those five (this includes the four conference champions who didn’t get automatic bids) and line them up by average final CFP ranking. The first seven on that list are our at-large bids.
Fourth, we put the teams into the bracket based on that ranking.
Fifth, we assign teams to bowl games in the quarterfinal and semifinal round based on historic bowl ties (SEC: Sugar Bowl; Big Ten: Rose Bowl; Big 12: Sugar Bowl; ACC: Orange Bowl) and geography where historic ties are impossible to satisfy. This is done, like the rest, in accordance with CFP protocol.
The end result is a realistic bracket which is something of an “average” scenario, in that it compares every team’s average season against those of its competition.
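For completeness, a minimal sketch of the bid-selection steps above. It skips the playoff-probability fallback for teams who average finishing unranked, and it stops short of seeding and bowl placement.

```python
def project_bracket(avg_cfp_rank, conference_favorites):
    """Sketch of the bid-selection steps above.

    avg_cfp_rank: every team's average final CFP ranking across simulations
    (lower is better). conference_favorites: the favorite in each conference
    with automatic-bid access; all must appear in avg_cfp_rank.
    Returns (auto_bids, at_larges).
    """
    # The five best conference favorites, by average final ranking, get auto bids.
    favorites_in_order = sorted(conference_favorites, key=avg_cfp_rank.get)
    auto_bids = favorites_in_order[:5]

    # Everyone else, including the favorites that missed, competes for seven at-larges.
    others = sorted((t for t in avg_cfp_rank if t not in auto_bids), key=avg_cfp_rank.get)
    at_larges = others[:7]
    return auto_bids, at_larges
```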
Section 4 – FCS Playoff Probabilities & Bracketology
Our FCS model is live for the first time since 2018, and there’s been a lot of change. We don’t have access to every specific metric the committee uses, but we found, looking mostly back at last year, that we could do a passable job predicting the FCS bracket through a formula which combines total losses (bad), wins over non-Division I teams (bad), wins over FBS teams (good), Movelor rating, and Movelor Wins Above Bubble. Most of the committee’s behavior can be explained by lining teams up by losses, discounting those wins over non-D1 teams, and giving a boost to teams who beat an FBS opponent. How good the team is (Movelor) and how good their résumé is (Wins Above Bubble) do help refine the seeding, though.
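To make the shape of that formula concrete, here’s a toy version. The signs follow the description above (losses and non-Division I wins hurt, FBS wins help, Movelor and Wins Above Bubble refine the order); every coefficient is a placeholder rather than a fitted value.

```python
def fcs_committee_score(losses, wins_vs_non_d1, wins_vs_fbs, movelor, wins_above_bubble):
    """Toy version of the FCS seeding formula described above.

    Higher scores mean better projected seeds. The coefficients are
    illustrative placeholders, not the values we actually use.
    """
    return (-2.0 * losses
            - 1.0 * wins_vs_non_d1
            + 1.5 * wins_vs_fbs
            + 0.10 * movelor
            + 0.50 * wins_above_bubble)
```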
Acknowledgments
We take a lot of our ideas from the elo-based models of Nate Silver and Jay Boice. Silver has also been a great example for us when it comes to transparency and accountability. We are, again, admittedly amateur-ish with this model, but his is the standard we’re trying to live up to.
Second, we’re grateful to Ken Pomeroy and Joe Lunardi, two college basketball projection pioneers. We’ve copied and stolen concepts and practices from lots of people, but probably mostly from Pomeroy, Lunardi, and Silver.
Third, we’re always indebted to collegefootballdata.com for compiling the schedule each year. There is an annual “oh shit” moment in August when we are running behind, and there is an annual “oh thank goodness” moment when we get that Excel file off their website.
Last, as always, we are most indebted to you, the reader, for helping this be a little bit more than a dorky hobby. Thanks for being here.
