How Our Bracketology Works – Final* Version, 2023

We’re six days away from Selection Sunday 2023, which means it’s probably past time to spell out how exactly our bracketology works this year.

Predictive, Not Reflective

The brackets on our site are predictive, meaning they’re a projection of where things will stand on Selection Sunday rather than a reflection of where they stand today. This isn’t a huge distinction this late in the season, but if your team wins a game as a favorite and doesn’t move up much (or at all), this is why.

Conference Tournament Probabilities

We run internal conference tournament probabilities to build this bracket, and we use them to determine three things:

Who is the conference tournament favorite in each conference?
What is the likeliest number of NCAA Tournament “bid thieves”?
What is the likeliest number of NIT automatic bids, and which teams are likeliest to take those bids?

How We Predict Results, How We Predict Ratings Movement

To predict individual game results, we use KenPom and KenPom alone. KenPom favorites are favorites in our model’s eyes, KenPom underdogs are underdogs in our model’s eyes, but we calculate the median finish rather than having the favorite win each game. Put otherwise: We don’t project Alabama to win every game it plays in the SEC Tournament, even though it will be favored in every individual game. We project it to hit its median expected performance, which is winning one or two games but then losing.
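The median-finish idea can be illustrated with a minimal Python sketch. The per-round win probabilities below are hypothetical numbers chosen for illustration, not actual KenPom figures, and the function names are our own assumptions rather than anything from the model itself.

```python
def win_count_distribution(round_win_probs):
    """Return P(team wins exactly k games) for k = 0..n in a single-elimination event.

    round_win_probs[i] is the assumed probability of winning round i + 1,
    given that the team reached it.
    """
    dist = []
    reach = 1.0  # probability of reaching the current round
    for p in round_win_probs:
        dist.append(reach * (1.0 - p))  # reached this round, then lost
        reach *= p
    dist.append(reach)  # won every round
    return dist


def median_wins(round_win_probs):
    """Smallest win total whose cumulative probability reaches 50%."""
    cumulative = 0.0
    for wins, prob in enumerate(win_count_distribution(round_win_probs)):
        cumulative += prob
        if cumulative >= 0.5:
            return wins
    return len(round_win_probs)


# Hypothetical team favored in all three of its games (80%, 65%, 55%).
favorite = [0.80, 0.65, 0.55]
print(median_wins(favorite))  # prints 2
```

With these made-up inputs, the team is favored in every game yet wins the title only about 29% of the time, so its median finish is two wins and then a loss.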

For quadrant wins and losses (a variable in our NCAA Tournament Selection Score, below), we do not count on NET to change. We can’t predict it accurately enough to do that, so we operate with NET rankings as they are.

We do predict slight shifts in NET, KPI, and SOR for teams’ eventual scores (again, more on those below). To do that, we use a KenPom-based formula in conjunction with how many games are expected to remain for each team. These projected shifts are small, though. The bigger things we’re predicting are what will happen with bid thieves, NIT automatic bids, and Q1/Q2 wins and losses.

How Our Brackets Are Built

Each team is assigned four different scores every time we run our model:

The first is our NCAA Tournament Selection Score. This is the most complicated of the four scores, with boosts and deductions for having a lot of Q1 wins, having a lot of Q2 wins, having a strong Q1/Q2 win percentage, having no Q2/Q3/Q4 losses, being a Power Six team, having a poor Q1 win percentage, having a record close to .500 overall, having a weak nonconference strength of schedule, and finishing the season in freefall (finishing on a 2–8 stretch or worse; the threshold was initially 3–7 or worse, but we changed it after a further look). Its base, though, like the other scores, is a mixture of our normalized projected NET, KPI, SOR, and KenPom, calibrated to optimize accuracy when applied to the 2019, 2021, and 2022 bracketing processes. All of these variables are our projected variables from before: the ones each team will hit if it performs to its median expectation from here through Sunday.
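As a rough illustration of the structure described above (a weighted base plus boosts and deductions), here is a minimal Python sketch. The equal weights, the adjustment values, and the assumption that each metric has already been normalized to a common higher-is-better scale are all hypothetical; the actual weights and adjustment sizes are not given here.

```python
# Hypothetical weights for illustration only; the real calibrated mix is not published.
HYPOTHETICAL_WEIGHTS = {"net": 0.25, "kpi": 0.25, "sor": 0.25, "kenpom": 0.25}


def base_score(normalized_metrics, weights=HYPOTHETICAL_WEIGHTS):
    """Weighted mix of metrics assumed to be normalized so higher is better."""
    return sum(weights[name] * normalized_metrics[name] for name in weights)


def selection_score(normalized_metrics, adjustments):
    """Base score plus any boosts (positive) and deductions (negative)."""
    return base_score(normalized_metrics) + sum(adjustments)


# Made-up team: normalized projections plus one boost and one deduction.
team = {"net": 0.8, "kpi": 0.6, "sor": 0.7, "kenpom": 0.9}
print(selection_score(team, adjustments=[0.05, -0.02]))
```

Keeping the adjustments as a separate list makes it easy to see how the simpler scores, which skip the boosts and deductions, reduce to the base alone.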

We use the NCAA Tournament Selection Score to determine which teams receive projected at-large bids in our NCAA Tournament bracketology. After automatic bids are assigned, we line up the at-large candidates in order by NCAA Tournament Selection Score, and we take the top 36.
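The at-large step above amounts to a sort and a cutoff. A minimal Python sketch follows, with hypothetical team names and scores, and a function name that is our own assumption:

```python
def pick_at_large(selection_scores, auto_bids, n_at_large=36):
    """Return the at-large teams, best Selection Score first.

    selection_scores: dict mapping team name -> Selection Score
    auto_bids: set of teams already holding automatic bids
    """
    candidates = [team for team in selection_scores if team not in auto_bids]
    candidates.sort(key=selection_scores.get, reverse=True)
    return candidates[:n_at_large]


# Tiny made-up example with two at-large spots instead of 36.
scores = {"Team A": 9.0, "Team B": 8.0, "Team C": 7.0, "Team D": 6.0}
print(pick_at_large(scores, auto_bids={"Team A"}, n_at_large=2))
```

The last name in the returned list is the last team in by Selection Score.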

The second score is our NCAA Tournament Seeding Score. This is different from the Selection Score, because we’ve found the committee prioritizes different things with seeding than it does with selection. It’s simpler than the Selection Score: it removes all those boosts and deductions from before, uses an altered ratio of our normalized projected NET:KPI:SOR:KenPom, and adds two new variables. One is the AP Poll from Monday of Champ Week, and the other is a penalty for being a mid-major or a low-major.

We use the NCAA Tournament Seeding Score to seed our bracket, 1 to 68. The lone exception is that the last team in by Selection Score gets the last at-large spot, regardless of Seeding Score.
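One way to read the exception above is sketched below in Python. It assumes that "the last at-large spot" means the lowest position on the seed list held by any at-large team, which is our interpretation. The team names and scores are hypothetical.

```python
def seed_field(seeding_scores, at_large, last_team_in):
    """Return the field ordered from overall seed 1 downward.

    seeding_scores: dict mapping every team in the field -> Seeding Score
    at_large: set of at-large teams (must include last_team_in)
    last_team_in: the at-large team with the lowest Selection Score
    """
    order = sorted(seeding_scores, key=seeding_scores.get, reverse=True)
    # Positions in the seed list that belong to at-large teams.
    at_large_slots = [i for i, team in enumerate(order) if team in at_large]
    # At-large teams in Seeding Score order, with the last team in moved to the end.
    at_large_order = [order[i] for i in at_large_slots if order[i] != last_team_in]
    at_large_order.append(last_team_in)
    for slot, team in zip(at_large_slots, at_large_order):
        order[slot] = team
    return order


# Made-up five-team field: A and E are automatic bids, C is the last team in.
scores = {"A": 9.0, "B": 8.0, "C": 7.0, "D": 6.0, "E": 5.0}
print(seed_field(scores, at_large={"B", "C", "D"}, last_team_in="C"))
```

Automatic-bid teams keep their positions in this sketch; only the at-large teams are reshuffled among the slots they already occupy.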

The third score is our NIT Selection Score. This is simply a weighted average of our normalized projected NET, KPI, SOR, and KenPom, with the lone additional variable being whether a team is projected to finish below .500.

We use the NIT Selection Score to determine which teams are projected to make the NIT. After taking automatic bids, any teams in Bid Thief Seats, and the teams we expect the NCAA Tournament committee to label the “First Four Out” (earning them NIT 1-seeds), we line up the remaining teams in order of NIT Selection Score, then take enough to fill the 32-team bracket. We don’t take teams in line for automatic bids as at-large bids here, and we don’t take teams projected to win their conference tournament if they aren’t also projected as one of the likeliest NIT automatic bids.

How this works, in practice: Today, Tuesday, March 7th, Oral Roberts is in NIT territory by NIT Selection Score, but they’d be an automatic bid if they made it, and they aren’t one of the likeliest NIT automatic bid teams. So, they’re excluded. Similarly, College of Charleston is excluded, because they’re projected to win the CAA Tournament, even though they would be a strong at-large candidate were they to lose. UC Irvine is the Big West Tournament favorite, but they still receive an NIT automatic bid in our projection because their 71% probability of losing the Big West Tournament is among the twelve highest NIT auto-bid probabilities.
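The fill order for the NIT field can be sketched in Python as follows. This is a simplified illustration, not the actual implementation: the function name, the team names, and the scores are hypothetical. The "excluded" set is assumed to hold every team that cannot take an NIT at-large spot, meaning the projected NCAA Tournament field plus the two kinds of exclusions described above.

```python
def build_nit_field(nit_scores, locked_in, excluded, field_size=32):
    """Return the projected NIT field.

    nit_scores: dict mapping team -> NIT Selection Score
    locked_in: teams placed before the at-large fill (projected NIT automatic
               bids, Bid Thief Seats, First Four Out), in any order
    excluded: teams that cannot take an at-large spot
    """
    field = list(locked_in)
    unavailable = set(field) | set(excluded)
    candidates = sorted(
        (team for team in nit_scores if team not in unavailable),
        key=nit_scores.get,
        reverse=True,
    )
    open_spots = max(0, field_size - len(field))
    return field + candidates[:open_spots]


# Made-up example with a four-team field instead of 32.
scores = {"A": 5.0, "B": 4.0, "C": 3.0, "D": 2.0, "E": 1.0}
print(build_nit_field(scores, locked_in=["X", "Y"], excluded={"B"}, field_size=4))
```

In this toy run, team B has the second-best score but is skipped, the same way a team in line for an automatic bid is skipped in the at-large fill.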

The fourth score is our NIT Seeding Score. This is, again, a weighted average of our normalized projected NET, KPI, SOR, and KenPom. It determines who receives 2-, 3-, and 4-seeds, plus which teams are unseeded, with the NCAA Tournament committee in charge of choosing the 1-seeds.

How Accurate We Expect This to Be

In our backtesting over 2019, 2021, and 2022, our NCAA Tournament bracket would have been within one point of the Bracket Matrix average (using Bracket Matrix’s own scoring system) each year with the formulas we’re using. We don’t necessarily expect it to be average this year. These formulas are the best we could quickly do, having limited time and manpower, and the specific scenarios the formulas are built to handle may be different from the specific scenarios which materialize this go-round. But that’s the track record. I don’t have a great estimate for its accuracy, having only backtested it over those three tournaments (RPI was still in use in 2018, so that’s harder to look at), but I’d be surprised if it were in the bottom ten percent again. It should be adequate on Selection Sunday. It should only miss a couple of teams on the bubble, and it should only have a few bad seeding whiffs.

As for the NIT: It’s a lot harder to say. There was a big gap between the last two normal NITs. We have very little data with which to work. In backtesting of last year’s field, we only missed one team, and we hit all the seeds, but we also tailored the formula so it would do that. We don’t know how consistent the committee will be with itself.

Why Formulas, Why a Model

There are tons of good bracketologies out there, especially for the NCAA Tournament (there are also good NIT bracketologies out there, but there aren’t as many). What the broader bracketology world lacks is probabilities—probabilities that each team makes the tournament, probabilities regarding seeding, etc. The best way to get these probabilities is to do a Monte Carlo simulation (one where you input different random variables thousands of times), but to do that, we need a fully automated selection and seeding process. This leads us to formulas like the ones above. This is our latest step towards our dreamed-of comprehensive bracketology system, with probabilities and interactive features for you, the college basketball fan looking to find what your team needs.
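To show why an automated selection step matters for Monte Carlo work, here is a minimal Python sketch. It is a stand-in, not our planned system: instead of simulating individual games, it adds random Gaussian noise to each team’s projected score, and the function name, noise level, and team scores are all hypothetical.

```python
import random
from collections import Counter


def simulate_bid_probabilities(projected_scores, n_bids, n_sims=5000, noise_sd=1.0, seed=0):
    """Estimate each team's probability of landing in the top n_bids.

    projected_scores: dict mapping team -> projected score
    noise_sd: assumed standard deviation of the random shock per simulation
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    made_field = Counter()
    for _ in range(n_sims):
        simulated = {
            team: score + rng.gauss(0.0, noise_sd)
            for team, score in projected_scores.items()
        }
        # The automated selection step: sort by simulated score and cut.
        selected = sorted(simulated, key=simulated.get, reverse=True)[:n_bids]
        made_field.update(selected)
    return {team: made_field[team] / n_sims for team in projected_scores}


# Made-up bubble: two spots, with Team B and Team C close together.
scores = {"Team A": 10.0, "Team B": 5.0, "Team C": 4.5, "Team D": 0.0}
print(simulate_bid_probabilities(scores, n_bids=2))
```

Because the selection step is just a function call inside the loop, it can run thousands of times without anyone building a bracket by hand, which is the point of having formulas.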

Caveats

There’s a chance that we’ll adjust our NCAA Tournament Selection Score or NCAA Tournament Seeding Score formula between now and Sunday. We would only do that if one or both was clearly out of line with Bracket Matrix, projecting something projected by fewer than 5% of included brackets, but we reserve the right to do that, and we want to warn you up front that we might. Our model is only built from the last five years of data, and one of those years doesn’t have any data to offer because the NCAA Tournament and NIT didn’t happen. So, we aren’t prepared for every fringe scenario. It’s possible there is a team sheet or an injury or coaching situation that will break our model. If that happens, we will adjust as we deem necessary.

Joe Stunardi