Our Bracketology Is Now in “Bridge” Mode

We changed our methodology for today’s bracketology, switching into what we’re calling our “bridge” mode between the “lite” version and our full model. In this bridge state, we’re using the full model’s selection and seeding formulas, but we don’t have full simulations, so we’re semi-manually and semi-haphazardly projecting the median final team sheet for each team.

Here’s how bridge mode currently works, and our reasoning behind our choices:

Predictive, Not “Where They Stand”

The brackets on our site do not show where we believe teams currently stand. They show where we project teams to end up. This means a few things:

NCAA Tournament automatic bids are assigned to the current conference tournament favorite.

NIT automatic bids are assigned to the current conference regular season favorites likeliest to lose in their conference tournaments. We assign twelve because this is our estimate of the likeliest number of NIT automatic bids, knowing what we know now.

One bubble team appears in both the NCAA Tournament and the NIT. This is because we project one as the likeliest number of bid thieves.

Bid Thieves Included

Again, we do account for bid thieves when drawing our NIT upper cut line. This creates some overlap. It is intentional, and it’s because we’re trying to give you the best idea of what the eventual bracket will look like given expected results from here.

Four Scores but Only Four Years

Our model is built primarily using data from 2018, 2019, 2021, and 2022. After we project median results for each team and assign automatic bids, we use four scores to assemble our brackets:

The first score is each team’s NCAA Tournament Selection Score. It’s primarily a mixture of NET, KPI, and SOR, with some KenPom thrown in and some curvature applied to reflect stratification, which increases as you get closer to the top-ranked team. In addition to our NET/KPI/SOR/KenPom mixture, we assign bonuses for having a lot of Q1 wins, having a lot of Q2 wins, having a strong Q1/Q2 win percentage, having no Q2/Q3/Q4 losses, and being a Power Six team; and we assign penalties for having a poor Q1 win percentage, having a record close to .500 overall, having a weak nonconference strength of schedule, and finishing the season in freefall (defined as going 3-7 or worse over one’s last ten games).

We use the first score to determine which teams receive at-large bids to our projected NCAA Tournament. It’s based on bubble decisions by the committee over the four tournaments in our data (2018, 2019, 2021, and 2022).
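To make the bonus-and-penalty idea concrete, here is a minimal sketch in Python. Only the freefall definition (3-7 or worse over the last ten games) comes from the description above. The `TeamSheet` fields, the `selection_adjustment` function, and every bonus or penalty size are hypothetical placeholders, not the actual formula, and the sketch treats "no Q2/Q3/Q4 losses" as a single bonus, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TeamSheet:
    """Hypothetical container for a few projected team-sheet fields."""
    q2_losses: int
    q3_losses: int
    q4_losses: int
    last_ten_wins: int  # wins over the team's last ten games
    power_six: bool


def in_freefall(sheet: TeamSheet) -> bool:
    # Freefall as defined in the post: 3-7 or worse over the last ten games.
    return sheet.last_ten_wins <= 3


def selection_adjustment(sheet: TeamSheet, base_score: float) -> float:
    """Apply a few of the described bonuses and penalties to a base score.

    The base score stands in for the NET/KPI/SOR/KenPom mixture.
    The 1.0 sizes below are placeholders chosen for illustration only.
    """
    score = base_score
    if sheet.power_six:
        score += 1.0  # Power Six bonus (placeholder size)
    if sheet.q2_losses + sheet.q3_losses + sheet.q4_losses == 0:
        score += 1.0  # no Q2/Q3/Q4 losses bonus (placeholder size)
    if in_freefall(sheet):
        score -= 1.0  # freefall penalty (placeholder size)
    return score
```

The remaining bonuses and penalties (Q1 and Q2 win counts, Q1 win percentage, closeness to .500, nonconference strength of schedule) would slot in the same way.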

The second score is each team’s NCAA Tournament Seeding Score. This is different from the NCAAT Selection Score, because we’ve found in backtesting that this gives us stronger precision with our predicted bracket. It doesn’t have as many bells and whistles (this tracks, intuitively, because we’d imagine the committee’s debate over whether to include teams is more detailed than its debate over where to seed teams). It includes the AP Poll. It more heavily weights KenPom and less heavily weights NET. Rather than assigning a P6 bonus, it assigns a small penalty for being a mid-major and a medium penalty for being a low-major.

We use the second score to determine the order in which teams are seeded in our projected NCAA Tournament bracket, with one exception: The last at-large seed is given to the last team in the field, which is something the committee’s seemed to do the last few years to be prepared for a late-weekend bid thief.
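The ordering rule and its one exception can be sketched as follows. This is an illustration only: the function name `order_at_large`, the dictionary keys, and the convention that a higher score is better are all assumptions, and the sketch orders at-large teams alone rather than interleaving automatic qualifiers as a full bracket would.

```python
def order_at_large(teams):
    """Order at-large teams for seeding.

    teams: list of dicts with hypothetical keys 'name', 'selection',
    and 'seeding', where a higher score is assumed to be better.
    Teams are ordered by seeding score, except that the last team in
    the field (lowest selection score) takes the last at-large slot.
    """
    last_in = min(teams, key=lambda t: t["selection"])
    others = [t for t in teams if t is not last_in]
    others.sort(key=lambda t: t["seeding"], reverse=True)
    return [t["name"] for t in others] + [last_in["name"]]
```

In this sketch a team with a strong seeding score still lands in the final at-large slot if its selection score is the lowest in the field.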

The third score is each team’s NIT Selection Score. This is much simpler than each of the preceding scores, and is just a weighted average of NET, KPI, SOR, and KenPom.

We use the third score to determine which teams make our projected NIT. We also remove all teams projected to finish below .500, because while we know not whether that is a rule, it sure seems like one given how the committee has operated in recent seasons.
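A simple version of this step might look like the sketch below. The equal weights, the key names, and the use of rankings (where lower is better) are assumptions made for illustration; the actual weights are not given here.

```python
def nit_selection_score(ranks, weights=None):
    """Weighted average of NET, KPI, SOR, and KenPom rankings.

    Equal weights are a placeholder. Lower is better, because the
    inputs are assumed to be rankings.
    """
    if weights is None:
        weights = {"net": 0.25, "kpi": 0.25, "sor": 0.25, "kenpom": 0.25}
    return sum(weights[k] * ranks[k] for k in weights)


def nit_candidates(teams):
    """Drop teams projected below .500, then sort best-first.

    teams: list of dicts with hypothetical keys 'name', 'proj_wins',
    'proj_losses', and 'ranks'.
    """
    eligible = [t for t in teams if t["proj_wins"] >= t["proj_losses"]]
    eligible.sort(key=lambda t: nit_selection_score(t["ranks"]))
    return [t["name"] for t in eligible]
```

Teams already in the projected NCAA Tournament field would be removed before the NIT field is drawn from this list.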

The fourth score is each team’s NIT Seeding Score. This is simple, like the NIT Selection Score. Why are these so simple? They give us enough accuracy in backtesting while not trying to predict specific preferences of a committee that’s given much less time each season than the NCAA Tournament committee and that is more prone to big surprises.

We use the fourth score to seed teams in our projected NIT, with the lone exception being that the four 1-seeds go, in order, to the first four teams out of the NCAA Tournament, meaning those are determined by NCAA Tournament Selection Score.

We’re Learning

These formulas are subject to change as the year goes on. We’ll track that somewhere, but if we see a big deviation happening between our predicted bracket and the consensus at Bracket Matrix as the tournament gets closer, we’ll investigate and see what’s causing it, then change our model if need be. This introduces subjectivity, which we don’t like, but the Bracket Matrix consensus has a better track record than we do, so we aren’t going to be too proud over here.

We Trust KenPom

While in bridge mode, we’re taking a very simple approach to predicting final NET/SOR/KPI rankings: We just make a portion of them—a portion proportional to how much of the pre-selection season is remaining—KenPom, and use the current ratings for the remaining portion. This is not an ideal way to do it, and will not be how our full model does it, but KenPom’s our best predictor available of how a team will play from this moment forward, so we’re using it.
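In code, the blend described above is a weighted average whose KenPom weight equals the share of the season still to be played. The function name and the choice to measure the season in games (rather than days, say) are assumptions for this sketch.

```python
def project_final_rank(current_rank, kenpom_rank, games_played, games_total):
    """Blend a team's current ranking with its KenPom ranking.

    The KenPom weight is the fraction of the pre-selection season
    remaining; the current ranking gets the rest of the weight.
    Measuring the season in games is an assumption.
    """
    remaining = (games_total - games_played) / games_total
    return remaining * kenpom_rank + (1 - remaining) * current_rank
```

For example, a team ranked 40th in NET and 20th in KenPom with 24 of 32 games played has a quarter of its season left, so its projected final NET ranking is 0.25 × 20 + 0.75 × 40 = 35. By Selection Sunday the KenPom weight shrinks to zero and the projection is simply the current ranking.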

Why Formulas, Why a Model

There are tons of good bracketologies out there, especially for the NCAA Tournament (there are also good NIT bracketologies out there, but there aren’t as many). What the broader bracketology world lacks is probabilities—probabilities that each team makes the tournament, probabilities regarding seeding, etc. The best way to get these probabilities is to do a Monte Carlo simulation (one where you input different random variables thousands of times), but to do that, we need a fully automated selection and seeding process. This leads us to formulas like the ones above.
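A toy version of the Monte Carlo idea is sketched below. It is not the full model: instead of running the selection formulas on each simulated season, it uses a hypothetical win threshold as a stand-in for "makes the tournament," and the per-game win probabilities are assumed to come from somewhere else, such as a ratings system.

```python
import random


def simulate_make_probability(current_wins, win_probs, wins_needed,
                              n_sims=10000, seed=0):
    """Estimate how often a team reaches a win threshold.

    win_probs: assumed win probability for each remaining game.
    wins_needed: hypothetical threshold standing in for the automated
    selection process described in the post.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    made = 0
    for _ in range(n_sims):
        # Play out every remaining game once with a random draw.
        wins = current_wins + sum(rng.random() < p for p in win_probs)
        if wins >= wins_needed:
            made += 1
    return made / n_sims
```

The real version would replace the threshold check with the fully automated selection and seeding step, which is why that step has to be formula-driven.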

How Accurate We Expect This to Be

In our backtesting over 2019, 2021, and 2022, our NCAA Tournament bracket would have been within one point of the Bracket Matrix average (using Bracket Matrix’s own scoring system) each year with the formulas we’re using. We don’t necessarily expect it to be average this year (these formulas are the best we could quickly do with limited time and manpower, and the specific scenarios the formulas are built to handle may differ from the ones which materialize this go-round), but that’s the track record. Having backtested it over only those three tournaments, I don’t have a great estimate of its accuracy (RPI was still in use in 2018, so that year is harder to look at), but I’d be surprised if it was in the bottom ten percent again. It should be adequate on Selection Sunday. It should only miss a couple of teams on the bubble, and it should only have a few bad seeding whiffs.

Before Selection Sunday? Ours should be a much more accurate look at the eventual bracket than almost all these other bracketologies, for the simple reason that our approach is built to predict the eventual field and theirs are mostly built to, by the time February rolls around, predict what the field currently looks like.

As for the NIT: It’s a lot harder to say. There was a big gap between the last two normal NITs, and this year’s might not be perfectly normal, with the pivot away from Madison Square Garden refreshing the possibility of teams opting out. In backtesting of last year’s field, we only missed one team, and we hit all the seeds, but we also tailored the formula so it would do that. We don’t know how consistent the committee will be.

What’s Next

We have five steps between where we’re at now and the full bracketology model for this year. The first two involve simulating the rest of the games themselves, including conference tournaments. The third involves estimating NET, KPI, and SOR based on varying results. The fourth is to automate the selection process, which should be quick once we have all these ingredients. The fifth is getting the outputs on the site in a digestible package. When this is all done, our aim is to show, beyond just our brackets, each team’s median projected seed, each team’s probability of making each tournament, what threshold of wins each team needs to hit to be 90% likely to make each tournament, and various probabilities for conference championships and tournament championships. We’re also hoping to use this to push out a Bubble Watch, but that may be overly ambitious.

We’ll keep chugging on those first two steps over the next two days. Once we have them done, we’ll move on to the third, fourth, and fifth steps. I doubt this will all be done by Friday. I’m unsure if it will be done next week. Monday is our current goal, though, and I’ll try to include updates in Joe’s Notes. In the meantime, this is what we have. Thanks for bearing with us, thanks for visiting The Barking Crow, and enjoy the basketball. We’ll see you soon.


Joe Stunardi
