How Our College Basketball Model Works

Our college basketball model is live, this time in its full form. I think this is the first time we’ve had it fully operational since Covid hit. It’s good to be back.

The outputs associated with this model can be found in these places:

Here’s how it all works, and where our thoughts stand:

How We Simulate the Games

Our model’s starting point is kenpom. We are unaware of any better system grading the current quality of individual college basketball teams. To simulate games, we assign win probabilities based on the gap between each team’s adjusted efficiency margin on kenpom, with a standard adjustment baked in to account for home-court advantage. We don’t account for tempo. We don’t account for some home courts being more advantageous than others. If a team is playing at an arena in its home city but that arena is not its home court (e.g., when Vanderbilt plays at Bridgestone Arena during the SEC Tournament), it does not receive home-court advantage. Once we have those win probabilities, we generate a random number for each game. Which side of the win probability the random number lands on determines who wins the game in that simulation. How far the random number is from the win probability determines the game’s margin. These are calibrated to roughly match kenpom’s probabilities, though not exactly.
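
As a concrete illustration, here’s a minimal sketch of that step. The logistic conversion, the home-court constant, and the margin multiplier are our assumptions for the sake of example; the post doesn’t publish the model’s actual functional forms or calibration.

```python
import random

HOME_COURT = 3.5     # hypothetical home-court bump, in points of efficiency margin
SPREAD_SCALE = 11.0  # hypothetical scale converting the margin gap to a win probability

def win_probability(aem_a: float, aem_b: float, a_is_home: bool) -> float:
    """Win probability for team A from the gap in adjusted efficiency margins."""
    gap = aem_a - aem_b + (HOME_COURT if a_is_home else 0.0)
    return 1.0 / (1.0 + 10.0 ** (-gap / SPREAD_SCALE))  # logistic link (our assumption)

def simulate_game(aem_a: float, aem_b: float, a_is_home: bool) -> tuple[bool, int]:
    """One simulated game: the random draw picks the winner, and its distance
    from the win probability sets the margin."""
    p = win_probability(aem_a, aem_b, a_is_home)
    r = random.random()
    a_wins = r < p
    distance = p - r if a_wins else r - p
    margin = max(1, round(distance * 40))  # 40 is an illustrative calibration constant
    return a_wins, margin
```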

We run these simulations “lukewarm,” which is to say this: We don’t keep the kenpom ratings static as the simulations go along. If a team blows out an opponent in one game in our simulation, they are expected to be better in their next game, just as it goes in the real world. We also, though, don’t run our simulations fully “hot.” When a team blows out an opponent in the real world, their kenpom rating rises more than it rises when the same thing happens in our model. Our model’s kenpom proxy is more cautious in how it reacts than the real kenpom is.
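
In code, a lukewarm update might look like a damped elo step, with the damping factor sitting between static (0) and hot (1). Both constants below are placeholders, not the model’s real values.

```python
HOT_K = 1.0     # roughly how much the real kenpom reacts per point of surprise (placeholder)
LUKEWARM = 0.5  # damping factor: 1.0 would be "hot," 0.0 would be static (placeholder)

def lukewarm_update(rating: float, expected_margin: float, actual_margin: float) -> float:
    """Nudge a simulated kenpom rating toward a game's result, more cautiously
    than the real kenpom would."""
    surprise = actual_margin - expected_margin
    return rating + LUKEWARM * HOT_K * surprise
```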

Why do we run the simulations lukewarm, rather than hot? We tried running hot simulations in 2022 during the conference tournaments, NCAA Tournament, and NIT. That process gave us misleading long-term probabilities. Kenpom is rare in that it is a good enough model that its ratings hold up well over time. Teams do move from their kenpom ratings, but they tend to move around within a certain window. When we ran our simulations hot, that window was unrealistically large. To calibrate our lukewarm simulations, we started with the window and worked backwards: we looked at how far teams move around in kenpom over a given period of time, then turned down the heat in our model until our long-term ratings movement was in line with that real-world window.
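
That calibration amounts to a one-dimensional search: measure rating drift at a given heat setting and tune until it matches the real-world window. A sketch, assuming drift grows with heat and taking the drift measurement as a caller-supplied function (the helper and iteration count are our assumptions):

```python
from typing import Callable

def calibrate_heat(measure_drift: Callable[[float], float], target_drift: float) -> float:
    """Bisect on the damping factor until simulated long-run rating movement
    matches how far teams actually move in kenpom over the same stretch."""
    lo, hi = 0.0, 1.0
    for _ in range(25):
        heat = (lo + hi) / 2.0
        if measure_drift(heat) > target_drift:
            hi = heat  # simulated window too wide: turn down the heat
        else:
            lo = heat
    return (lo + hi) / 2.0
```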

One other thing to note, in our kenpom approximating: Unlike the real kenpom, our simulated kenpom doesn’t worry about what teams’ opponents do in other games. It functions like an elo system, not as anything Bayesian.

When the regular season games are completed, we slot teams into their conference tournaments based on their conference record. However, we don’t account for conference tiebreakers. Not right now, anyway. As the season’s end draws closer, we’ll input some manual adjustments on this front, especially when it makes a big difference (when it affects a bye, when it affects home-court advantage, etc.). For the most part, though, we’re using random tiebreakers.
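
Random tiebreakers fall out of a single sort key, since the random component only matters when win percentages are exactly equal. A minimal sketch:

```python
import random

def conference_seeds(records: dict[str, tuple[int, int]]) -> list[str]:
    """Order teams by conference record; records maps team -> (wins, losses).
    The random key only breaks exact ties in win percentage."""
    def key(team: str) -> tuple[float, float]:
        wins, losses = records[team]
        return (-wins / (wins + losses), random.random())
    return sorted(records, key=key)
```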

Selection Process

To determine which teams make the NCAA Tournament and NIT, and to determine where their seed lines lie, we’re using the same formulas we used last year. The bulk of those formulas comes from four of the six ratings systems included on the NCAA’s team sheets. One of those four is kenpom. The other three we use are NET, KPI, and SOR.

NET, KPI, SOR

To simulate how teams’ NET ratings change over time, we again use an elo-esque adjustment system. We start with the team’s current ranking, project it onto a normal curve to assign it a score, then tally all the team’s kenpom movements over the simulated games and add or subtract that sum from the score.
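
A sketch of that NET step, using the inverse normal CDF to project a ranking onto the curve. The Division I team count and the weight on kenpom movement are our placeholders:

```python
from statistics import NormalDist

N_TEAMS = 362  # approximate Division I team count (placeholder)

def rank_to_score(rank: int) -> float:
    """Project a ranking onto a normal curve: rank 1 maps high, rank 362 low."""
    return NormalDist().inv_cdf(1.0 - (rank - 0.5) / N_TEAMS)

def simulated_net_score(current_rank: int, kenpom_moves: list[float],
                        move_weight: float = 0.1) -> float:
    """Start from the current NET ranking's score, then add the summed kenpom
    movement from the simulated games (move_weight is illustrative)."""
    return rank_to_score(current_rank) + move_weight * sum(kenpom_moves)
```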

To simulate how teams’ KPI ratings change over time, we again rely on an elo-esque approach. Unlike our NET and kenpom approach (and unlike the real KPI), we don’t worry about margin here. For each simulated game, we create a variable labeling how surprising a result is based on the current KPIs of the teams involved. (We do not change KPIs between games. Those are run static, not hot or lukewarm.) If a team wins, its variable is positive. If a team loses, it’s negative. We then tally all those adjustments and add or subtract them from each team’s current KPI score, again leading us to a final ranking.
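
A sketch of the KPI step. The logistic “surprise” function and the scale constant are our assumptions; only the sign convention (positive for wins, negative for losses) and the static KPIs come from the description above.

```python
import math

def kpi_adjustment(kpi_team: float, kpi_opp: float, team_won: bool) -> float:
    """Per-game adjustment: positive for a win, negative for a loss, with a
    larger magnitude when the result is more surprising (logistic assumption)."""
    p_win = 1.0 / (1.0 + math.exp(kpi_opp - kpi_team))  # crude win chance from KPIs
    return (1.0 - p_win) if team_won else -p_win

def simulated_kpi(current_kpi: float, games: list[tuple[float, bool]],
                  scale: float = 0.02) -> float:
    """games holds (opponent_kpi, team_won) pairs; KPIs stay static within a run.
    scale is an illustrative constant."""
    total = sum(kpi_adjustment(current_kpi, opp_kpi, won) for opp_kpi, won in games)
    return current_kpi + scale * total
```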

We use a similar process for SOR, except instead of worrying about incoming SOR, we use the current BPI ratings of the teams involved. (Our impression is that the SOR the committee uses, which comes from ESPN, relies on BPI.)

Are these perfect approaches? No, no, not at all. This is a major issue with our model, especially because timing dictates that we often run our model before a night’s NET, KPI, and SOR updates are published. We run the model overnight. NET, KPI, and SOR update in the morning. This can lead to some “rippling” in our model, in which our model homes in on its reaction to a game over two days instead of immediately reacting in accurate alignment with the real world.

Once we have each team’s final NET, KPI, SOR, and kenpom ranking, we convert each back into a score projected onto a normal distribution. This is intended to replicate how a committee member seeing raw rankings for each metric will care more about those rankings than about the actual grades each system gives each team. (In short: it doesn’t help the 41st-ranked team in our model to be really close to 30th and really far from 50th, if the pack works that way.) Once we have these scores, we convert them into the base (or entirety) of our four selection and seeding formulas. Those bases break down as follows:

  • NCAAT Selection Score: 33% NET, 33% SOR, 31% KPI, 3% kenpom
  • NCAAT Seeding Score: 36% KPI, 29% SOR, 25% kenpom, 11% NET
  • NIT Selection Score: 31% NET, 23% SOR, 23% KPI, 23% kenpom
  • NIT Seeding Score: 30% NET, 30% kenpom, 20% SOR, 20% KPI
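
Mechanically, each base is a weighted sum of the four rank-projected scores. A sketch, reusing the rank-to-score idea from the NET sketch above (the team count remains a placeholder):

```python
from statistics import NormalDist

N_TEAMS = 362  # placeholder Division I team count

def rank_to_score(rank: int) -> float:
    """Convert a final simulated ranking back into a normal-distribution score
    (same helper as in the NET sketch)."""
    return NormalDist().inv_cdf(1.0 - (rank - 0.5) / N_TEAMS)

NCAAT_SELECTION_WEIGHTS = {"net": 0.33, "sor": 0.33, "kpi": 0.31, "kenpom": 0.03}

def base_score(final_ranks: dict[str, int], weights: dict[str, float]) -> float:
    """final_ranks maps metric name -> a team's final simulated ranking."""
    return sum(w * rank_to_score(final_ranks[m]) for m, w in weights.items())
```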

These formulas are not supposed to replicate the exact thinking of the committee. We don’t think any committee members look at kenpom and say “I’ll make that 3% of my evaluation” or look at NET and say “that gets 33%.” We find, though, that committee thinking on each front generally aligns with this sort of breakdown. They might not be basing their decisions off of SOR and NET in this proportion, but whatever criteria they use are roughly in line with a 33% NET/33% SOR approach.

From these bases, we make a few adjustments. The first is the introduction of a random number which “wiggles” each team’s four scores, all with the same magnitude and direction. This is a broad uncertainty variable, one which stands in for how right and wrong we expect our formulas to be come Selection Sunday. Beyond that, there are additional adjustments which differ between the four scores.
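
For concreteness, that wiggle is one shared draw per team, applied identically to all four scores. Something like this, with the magnitude as a placeholder:

```python
import random

def wiggle(scores: dict[str, float], sigma: float = 0.15) -> dict[str, float]:
    """Shift all four of a team's scores by a single random draw. sigma is a
    placeholder standing in for expected formula error on Selection Sunday."""
    shift = random.gauss(0.0, sigma)
    return {name: value + shift for name, value in scores.items()}
```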

The NIT Seeding Score gets none of these additional adjustments. It is what it is.

The NIT Selection Score fundamentally changes in 20% of our simulations, in an attempt to reflect the greater uncertainty surrounding the NIT committee than the NCAA Tournament committee (the NIT committee is smaller, must work faster, and has undergone significant change more recently). Other elements of the NIT Selection Score, besides its base above (a sketch follows the list):

  • In 10% of simulations, the breakdown of NET/SOR/KPI/kenpom changes to match the NCAAT Selection Score’s breakdown.
  • In a different 10% of simulations, the breakdown of NET/SOR/KPI/kenpom flips to sit as far from the NCAAT Selection Score’s breakdown as our original base does, but on the other side: favoring SOR and KPI rather than NET and kenpom. This 10% of simulations asks, “How would the NIT selection committee operate if it emphasized résumé metrics to the degree it currently emphasizes predictive ratings?”
  • In 90% of total simulations (determined randomly, with no link to the 20% of simulations with alternate NIT Selection Scores), a team finishing with an overall record below .500 against Division I opponents is disqualified from receiving an NIT at-large bid. We don’t think there’s a formal rule against sub-.500 teams making the NIT as at-larges, but we haven’t seen it happen in our time running this blog. Also, at the moment we aren’t expecting any of the NIT automatic bids to go to sub-.500 teams. 90% is an arbitrary number.
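
Here’s the promised sketch of that regime-switching. The résumé-tilted weights are labeled placeholders, since the mirror construction described above isn’t fully specified in the post:

```python
import random

NIT_BASE    = {"net": 0.31, "sor": 0.23, "kpi": 0.23, "kenpom": 0.23}
NCAAT_BASE  = {"net": 0.33, "sor": 0.33, "kpi": 0.31, "kenpom": 0.03}
RESUME_TILT = {"net": 0.25, "sor": 0.35, "kpi": 0.35, "kenpom": 0.05}  # placeholder values

def nit_selection_weights() -> dict[str, float]:
    """80% of simulations use the base; 10% mimic the NCAAT committee; 10%
    lean on the resume metrics (SOR and KPI)."""
    roll = random.random()
    if roll < 0.10:
        return NCAAT_BASE
    if roll < 0.20:
        return RESUME_TILT
    return NIT_BASE

def sub_500_rule_applies() -> bool:
    """Drawn once per simulation, independently of the weight regimes: in 90%
    of runs, sub-.500 teams can't receive an NIT at-large bid."""
    return random.random() < 0.90
```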

The NCAAT Seeding Score doesn’t fundamentally change like the NIT Selection Score does, but we do add two variables:

  • The first is the AP Poll. We see a lot of correlation between NCAA Tournament seeding and the AP Poll, and while, again, this may not be an explicit decision by the committee, it’s hard not to assign some weight to the subliminal impact of the numbers on that screen. Another explanation? The AP Poll is a reflection of public opinion, and committee members have opinions which correlate with public opinion. Regardless of how it works, our model moves teams up and down from their current AP Poll position based on wins and losses (it’s all about recency; margin and quality of victory or defeat mean nothing), then spits out a final AP Top 25 at the end. This is linked to our seedings.
  • The second is a team’s status as a high-major, mid-major, or low-major. High-majors (the power six conferences plus Gonzaga) receive no adjustment. Mid-majors (teams in the AAC, A-10, MWC, and WCC besides Gonzaga) receive a small negative adjustment. Low-majors (everybody else) receive a moderate negative adjustment.

The NCAAT Selection Score is the most convoluted. Things it considers (sketched in code after this list):

  • Again, it worries about high-major status. High-majors (power six teams plus Gonzaga) receive a small boost.
  • It worries about nonconference strength of schedule. If a team’s NET NCSOS is 340th or worse, it gets dinged for it. We don’t simulate how NET NCSOS adjusts. We just use whatever it is on the day we run the model (or rather, the day before, again getting back to simulation timing).
  • It worries about Quadrant 1 wins and Quadrant 1 win percentage. If a team has five or more Q1 wins, it gets a boost. A sixth Q1 win earns an additional boost. If a team has a Q1 win percentage below .250, its score receives a deduction. If a team’s Q1 win percentage is below .150, that deduction grows larger. If a team has played zero Q1 games, that deduction is very large.
  • It worries about Q2 wins and Q2 win percentage, but in combination with Q1 numbers. If a team has 10 or more wins combined across Q1 and Q2, the team receives a boost. If a team has a Q1/Q2 combined win percentage of .450 or better, it receives a boost.
  • It worries about Q2, Q3, and Q4 losses, but only in that it gives a boost to teams with none in all three of those quadrants. If you are a bubble team with only Q1 losses, you’re going to get some love.
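
Here’s the promised sketch of those résumé adjustments. The conditions come from the list above; every magnitude is a placeholder, since the post doesn’t publish the sizes of the boosts and deductions:

```python
from dataclasses import dataclass

@dataclass
class Resume:
    high_major: bool
    net_ncsos_rank: int
    q1_wins: int; q1_games: int
    q2_wins: int; q2_games: int
    q2_losses: int; q3_losses: int; q4_losses: int

def selection_adjustment(r: Resume) -> float:
    adj = 0.0
    if r.high_major:                 # power six plus Gonzaga
        adj += 0.05
    if r.net_ncsos_rank >= 340:      # nonconference schedule penalty
        adj -= 0.05
    if r.q1_wins >= 5:               # Q1 volume boosts
        adj += 0.05
    if r.q1_wins >= 6:
        adj += 0.03
    if r.q1_games == 0:              # Q1 win-percentage deductions
        adj -= 0.15
    elif r.q1_wins / r.q1_games < 0.150:
        adj -= 0.10
    elif r.q1_wins / r.q1_games < 0.250:
        adj -= 0.05
    q12_wins = r.q1_wins + r.q2_wins
    q12_games = r.q1_games + r.q2_games
    if q12_wins >= 10:               # combined Q1/Q2 boosts
        adj += 0.05
    if q12_games and q12_wins / q12_games >= 0.450:
        adj += 0.04
    if r.q2_losses == r.q3_losses == r.q4_losses == 0:
        adj += 0.05                  # only Q1 losses: some love
    return adj
```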

There are other variables we could include in the NCAAT Selection Score. Unfortunately, we’re still trying to figure out the best way to do that. This is the biggest shortcoming of our model, and one we’ll address below.

To build the brackets within the model’s simulations, we include all automatic bids and at-large bids, then line teams up in order of their seeding score in the relevant tournament. Matchups aren’t determined by bracketing principles. Each team stays on its seed line, and NIT automatic bids get first-round home games, but regions are assigned randomly, the NCAAT First Four Out doesn’t necessarily correlate with the NIT 1-seeds (as it does in real life), and in the NIT, all sixteen unseeded teams get a random first-round opponent from the sixteen seeded teams.

Bracketology

That concludes our model’s work. It simulates all of that, then tells us the median cut lines on either side of the NIT, each team’s probability of winning its conference tournament, each team’s probabilities of making and winning the NCAAT and NIT, and each team’s median rank in NCAAT Selection Score, NCAAT Seeding Score, NIT Selection Score, and NIT Seeding Score.

To build our NCAAT Bracketology, we pluck each conference’s most likely automatic qualifier, then line up the at-large candidates based on median NCAAT Selection Score rank, taking the top 36 as at-larges. When we have our 68 teams, we order them by median NCAAT Seeding Score rank and build the bracket based on our best understanding of bracketing principles. If there are ties in median rank on either the selection or the seeding front, we break them by referring to each team’s mean rank.
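
The at-large step reduces to a sort: median NCAAT Selection Score rank first, mean rank as the tiebreaker. A minimal sketch, assuming the 36 at-larges join 32 automatic qualifiers:

```python
def ncaat_field(auto_bids: list[str],
                at_large_ranks: dict[str, tuple[float, float]]) -> list[str]:
    """at_large_ranks maps each candidate (not already an automatic qualifier)
    to (median selection rank, mean selection rank); mean breaks median ties."""
    at_larges = sorted(at_large_ranks, key=lambda t: at_large_ranks[t])[:36]
    return auto_bids + at_larges
```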

Building our NIT Bracketology is a little more complicated.

First, we refer to the median NCAAT cut line, drawing the line accordingly. This leaves us with a few NCAAT bubble teams who show up in both brackets. These teams are in what we call the “bid thief seats.” Today, Friday, February 23rd, Utah and Providence occupy these seats. What’s happening here is that we know the NCAAT cut line will likely rise a little as bids get stolen, and we know by how much it will most likely rise, but we don’t know who will take the bids. So, we leave a few more bubble teams in our NCAAT Bracketology than is realistic, and we then include them in our NIT Bracketology as well.

Second, we assign NIT automatic bids. These are a little tricky, because we don’t ask our model to publish every team’s final NET rank in its simulations. Should we do that? Maybe. Leaving it out speeds up the simulations, and we’re working with laptops here. Whether we should or shouldn’t, we don’t, so rather than find the likeliest automatic bids from each power six conference, we take the top two NIT-eligible teams from each conference (ranked by probability of making either the NIT or the NCAAT; eligible meaning they’re either not in the NCAAT or they’re in a bid thief seat) and give them automatic bids, which come with a guaranteed top-4 seed and a guaranteed first-round home game. (Update, 3/1: We are now trying to give these to the teams with the highest median final NET rankings in our simulations who are not in our NCAAT Bracketology. Our previous approach was suboptimal, so although this is a little dicier in terms of accuracy, we’re giving it a shot.)

Third, we assign 1-seeds to the NCAAT’s First Four Out—the four teams first in line for NCAAT bids behind our median NCAAT cut line. These teams are locked into our bracket and locked into 1-seeds.

Fourth, we draw the bottom cut line based on the median NIT cut line in our simulations, and we assign at-large bids to every team above it who is not already in the NIT (as an automatic bid, through the NCAAT First Four Out system, or via both routes). If we have a shortage of teams (we almost always do), we fill in the bracket with teams in line for NCAAT automatic bids who are also in NIT at-large territory. We go from lowest NCAAT automatic bid probability to highest as we go through this list, meaning a team 35% likely to get an NCAAT automatic bid will get the NIT nod before a team 50% likely to get an NCAAT automatic bid. (Update, 3/4: We are now basing these NCAAT automatic/NIT at-large crossovers on raw NIT probability.)

Fifth, we line teams up by median NIT Seeding Score rank, adjust the list so the automatic bids get their home games, and build the bracket in accordance with our best understanding of bracketing principles.

Now. We also have an NCAAT seed list. This used to be for convenience’s sake. Our bracketology takes a while to read because of the way our site is formatted. Our seed list is compact and digestible. It can also be more easily updated, making it useful in the face of a quick turnaround between late-night conference tournament games and early-morning conference tournament games. This year, we’re trying something new with it. Here’s what’s up:

We know our NCAAT Selection Score doesn’t account for every possible element of a team’s résumé which can affect its bubble destination. We haven’t thought of every possibility. We haven’t seen every possibility play out. There are some important variables (Q1A wins, to name one) which we know matter, but which don’t seem to affect teams’ outcomes in a linear or consistent fashion. We usually end up missing three or four bubble teams while more reputable bracketologists miss one. So, we’re going to do what those other bracketologists do: We’re going to watch what the herd does and examine résumés subjectively, guessing what the committee will do rather than estimating it purely based off our model. Our bracketologies will still be purely objective, and we won’t be doing this with the NIT (our model, weirdly, was great last year at predicting the NIT field even after stinking on the NCAAT bubble), but for the NCAAT, we’re going to use our seed list to make more guesses. We’ll always explain our decisions when the subjective seed list differs from the objective bracketology. Hopefully, through that process, we’ll learn more about how the bubble functions and acquire the knowledge necessary to improve our NCAAT Selection Score formula next year.

Vulnerabilities, Expectations

We’ve included five caveats in here. In order of their appearance:

  • Our game simulations use a watered-down, blurry approximation of kenpom. They do not use the real thing.
  • We (mostly) don’t use tiebreakers for conference tournament seeding. We (mostly) leave those random.
  • Our NET approximation is imperfect. Our KPI and SOR approximations are very hazy, and sometimes a day late. Our AP Poll approximation is exactly as serious and reasonable as the AP Poll itself.
  • Our model has a bad track record around the NCAAT/NIT bubble.
  • Our model’s simulations of the NCAA Tournament and NIT don’t account for bracketing principles, and they take a few other shortcuts as well, most notably not placing the NCAA Tournament’s First Four Out on the NIT 1-line.

Overall? I would say that you should trust our NIT Bracketology but expect it to miss a few teams on the bottom cut line: either one or two (if the committee’s consistent with its 2022 and 2023 criteria) or four or five (if the committee changes things up in a big way). I would say that you should expect our seed list to miss two NCAAT bubble teams, and that you should expect our NCAAT Bracketology to miss three or four teams on the bubble. For probabilities? I really don’t know what to expect. I think they’re pretty good, but I tend to be very optimistic, and who knows if Cody Williams will play in the NIT if Colorado does end up making it.

Overall, this is not an especially professional model. It is progress, but there are a lot of improvements to make for next year. Thanks for joining us on the journey.
