The NCAA Selection Procedure for 1999-2000

Preamble

For the better part of the past decade, the NCAA Division I hockey tournament has been seeded largely by statistical analysis. For the knowledgeable college hockey fan, this means not only more confidence that the selection and seeding is being done fairly without shady back-room deals, but also a chance to predict in advance how the tournament will be seeded. One impedement to this used to be the difficulty in finding out ahead of time what rules the selection committee was applying to make their decisions, but in March 1997 a major blow was struck for public education on the process. After the announcement of the tournament pairings for that year, Selection Committee chair Joe Marsh provided a detailed explanation of how it was done to Adam Wodon of US College Hockey Online, a web site devoted to college hockey. That kicked off a detailed effort to educate coaches and the public about the selection process on the part of the committee, and while some of the details of the 1999 selections were left unexplained, we still have a reasonable description of the process from Marsh's original interview, supplemented by the announcement of the changes introduced for 1999 and occasional inquiries to the NCAA.

The basics

First of all, from the NCAA's point of view, only official games played between established Division I programs count towards the selection process. This season, those teams are

West

WCHA

Alaska-Anchorage
Colorado College
Denver
Michigan Tech
Minnesota State-Mankato
Minnesota
Minnesota-Duluth
North Dakota
St. Cloud State
Wisconsin

CCHA

Alaska-Fairbanks
Bowling Green
Ferris State
Lake Superior State
Miami
Michigan
Michigan State
Nebraska-Omaha
Northern Michigan
Notre Dame
Ohio State
Western Michigan

College Hockey America

Air Force (the fine print)

East

ECAC

Brown
Clarkson
Colgate
Cornell
Dartmouth
Harvard
Princeton
Rensselaer Polytechnic Institute
St. Lawrence
Union
Vermont (the fine print)
Yale

Hockey East

Boston College
Boston University
Maine
Massachusetts-Amherst
Massachusetts-Lowell
Merrimack
New Hampshire
Northeastern
Providence

MAAC

American International
Canisius
Connecticut
Fairfield
Holy Cross
Iona
Quinnipiac
Sacred Heart

College Hockey America

Army
Niagara (the fine print)

Two of the ten MAAC teams--Bentley and Mercyhurst--plus three of the six CHA teams--Alabama-Huntsville, Bemidji State, and Findlay--are still Division II this season, and games against them will not be used in the selection process.

Pairwise Comparisons

The underlying principle behind the current selection process is the pairwise comparison. One team is compared to another team based on five criteria:

Ratings Percentage Index: Also known as RPI, not to be confused with Rensselaer Polytechnic Institute. This is a weighted average measuring a team's record and strength of schedule and consisting of 0.35 times the team's winning percentage, plus 0.50 times their opponents' average winning percentage, plus 0.15 times their opponents' opponents' winning percentage. (the fine print)
Record in last 16 games: This was changed as of the 1999 tournament; previously the criterion was record in last 20 games.
Record vs. Teams Under Consideration: A Team Under Consideration, or TUC, is a team with a record of .500 or above in games that count towards the selection process, as defined above. (the fine print)
Record vs. Common Opponents: This criterion compares the overall winning percentage of the two teams in question against all (eligible) opponents who appear on both their schedules, regardless of how many times they each play those teams. (the fine print)
Head-to-head record

A team wins one point towards the comparison for each of the first four criteria, and one point for each head-to-head game in which they defeated the other team in the comparison. Whichever team gets more points wins the comparison, and if it's a tie, the team with the higher RPI wins.

Every Team Under Consideration is compared to every other TUC in this way. The total number of such comparisons won is called the Pairwise Rating (PWR--the fine print). This number can be used to rank the TUCs, and in the past it was believed that the teams were seeded in the order of these Pairwise Rankings, but that is not precisely how it's done. The PWR is used to get a rough sense of which teams are in contention for which spots, but then those teams are placed according to the pairwise comparisons among or between them. For example, if you're battling it out for the twelfth and final spot in the postseason, it doesn't matter how you compare with the fifth-rated team. Thus a two-way tie is impossible, since one team will always win the pairwise comparison. If three teams end up in an unresolvable tie (rock-scissors-paper), we go to the RPI to resolve the deadlock.

One wild card: Conference strengths

With the debut last year of the Metro-Atlantic Athletic Conference, and the resulting appearance on the scene of six teams newly eligible for the Division I tournament and playing the lion's share of their games against one another, some of the weaknesses of the RPI and PWR systems were brought to light. MAAC regular-season champion Quinnipiac finished the season ranked 12th in the RPI and with a pairwise comparison advantage over all but 9 teams in the nation, but were not included in the NCAA's field of 12. This was presumably related to the following paragraph in the NCAA News report on the Summer 1998 Division I Men's Ice Hockey Committee meeting:

In addition to revising one of its selection criteria, the committee noted that it reserves the right to evaluate each team based on the relative strength of their respective conference using the overall conference ratings percentage index (RPI) in determining competitive equity.

It's not known exactly what measure was used to determine this lack of competitive equity, but with no games last season between the MAAC and any of the four established conferences, the best information on the relative strengths of the conferences was their respective performance against the four (last season) Division I Independents, which is summarized in the following table. (The average RPI of all the teams in the conference, which may be alluded to in the paragraph above, is also included for reference.)

Conference Strength
Conference	Avg RPI	vs Indies		vs Army		vs Niagara		vs Air Force		vs MSU-Mankato
Conference	Avg RPI	PF-PA	Pct	PF-PA	Pct	PF-PA	Pct	PF-PA	Pct	PF-PA	Pct
Hockey East	.523	14-2	.875	12-0	1.000	0-2	.000	2-0	1.000	0-0	.---
WCHA	.503	32-10	.762	0-0	.---	0-0	.---	10-0	1.000	22-10	.688
CCHA	.503	5-5	.500	0-0	.---	2-4	.333	0-0	.---	3-1	.750
ECAC	.495	28-10	.737	10-0	1.000	10-10	.500	2-0	1.000	6-0	1.000
MAAC	.454	8-22	.267	5-5	.500	0-4	.000	3-5	.375	0-8	.000

At any rate, the reason for Quinnipiac's deceptively high RPI and PWR last season is no big mystery. RPI attempts to correct a team's winning percentage for their strength of schedule by mixing it with the average winning percentage of their opponents. However, if those opponents have also played abnormally weak schedules, their winning percentages will be a poor indicator of their strength, and hence of the schedule strength of the team in question. According to the more sophisticated (the fine print) KRACH rating system, Quinnipiac was rated #41 out of 52 teams. The pairwise comparison algorithm is even more fragile, as the "Last 16" and "Teams Under Consideration" criteria make no allowance for strength of schedule at all, simply comparing the teams' winning percentages in those games. Niagara, despite having a low RPI, was able to win a few key comparisons last year by accumulating good records against weak teams in their last 16 games and against teams which accumulated winning records against weak schedules.

The bottom line is that the committee is at liberty to leave CHA and MAAC teams out of the tournament on the basis of the relative weakness of their schedules, even if their pairwise comparisons would otherwise entitle them to a berth. (Unfortunately, this method cannot correct for the other consequences of RPI's shortcomings, such as the potential overvaluing of top MAAC and CHA opponents appearing on major conference teams' schedules this season.) Here is a table of each conference's average RPI and their record vs each other conference; additionally, the team with the best RPI in the conference is listed as well as the average RPI of their conference opponents.

For comparison, here is how the KRACH rating system predicts each conference would fare if each of its teams played each Division I team in each other conference once.

What do we do with the numbers?

The NCAA tournament consists of twelve teams, divided for the first round and a half into two regionals, East and West. In each regional, two teams receive first-round byes while the other four play on the first night. On the second night of the regional, the two bye teams play the two first-round winners, with the two survivors from each regional then advancing to the national semifinals the following weekend. The selection and seeding process can be divided into the following steps:

Qualification

The regular season (the fine print) and tournament champions in each of the four major conferences (WCHA, CCHA, ECAC and Hockey East) receive automatic berths, which accounts for between four and eight of the twelve teams. (The MAAC and CHA do not receive any automatic bids.) The remaining four to eight at-large teams are selected according to the pairwise method. There is the one stipulation that each major conference must have at least two representatives in the tourney.

This is one of the places where our understanding of the process is still a little lacking. We know that the committee gives "obvious" at-large bids to teams that win comparisons with the rest of the candidates, then scrutinizes the "bubble" teams by comparing them individually to one another. Usually, the precise mechanics of this process are irrelevant, but in the 1999 selections, there were between two and four conceivable sets of tournament teams depending on how the bubble was pared down. We know that Ohio State and Northern Michigan got the last two bids in that particular season, but there were a couple of different lines of reasoning that could have given that result, and the selection committee hasn't explained which one was used.

Byes

Any major conference team which wins both the regular season title in its league and its conference tournament receives an automatic first-round bye; since there are two conferences in each region and two byes in each regional, this will fill between zero and two of those slots in a given region. The other bye(s), if any, are given to the best team(s) in the appropriate region(s) according to a pairwise analysis.

Regions

There are now four remaining spots in each regional to fill with the other eight teams. If those eight teams are evenly divided, four from each region, the two better teams in each region play in their respective regionals, while the two lower teams are "shipped out" to play in the opposite region. If there is an imbalance, the bottom team(s) from the over-represented region are placed into the other region before the swap. (snotty aside) However, the host schools (Minnesota in the West and Rensselaer in the East this year) must be kept in their own regions. (the fine print) Also, see "Fine Tuning".

Seedings and Pairings

Once the four non-bye teams in each regional are determined, they are placed in the three to six positions according to their pairwise comparisons. The four and five seeds will play in the first round, with the winner to face the one seed, while the three and six seeds will meet for the right to play the two seed.

Fine Tuning

At this point, we have a setup for the tournament according to the numbers, but there could be other problems with it. For instance, all four first-round contests could be rematches of the conference title games, or the teams with the biggest fan bases could be playing outside of their regions. These are both considered undesirable by the NCAA, so the committee can shuffle things a bit, either by altering the seedings within a region, or choosing to send different teams to the opposite regionals. First-round intra-conference matchups are positively verboten, and potential second-round games between teams in the same conference should be avoided, especially if the teams met in their conference playoffs. If two teams are swapped within a region to eliminate a second-round matchup, the other two teams will be swapped as well to retain the first-round pairings, if that doesn't cause more problems. Also, teams can be shifted to different regions to increase attendance.

This is the one part of the selection procedure which is really a judgement call on the committee's part, and thus the most unpredictable. Ordering teams within a regional is basically deterministic, but when deciding which non-bye teams go in which regional, the committee is supposed to consider

pairwise comparisons
attendance considerations
avoidance of intraconference matchups

How much weight they give to each is completely unspecified, although attendance seems to be very important, while conference considerations are not a big priority in populating the regionals. The best way to guess what they'll do is to look at historical precedent.

Putting it in practice

To get a detailed blow-by-blow of how this all works, you can read my description of how the 1998 tournament was seeded, as well as a summary of the seeding decisions from 1996-1998. My prediction of the 1999 seedings ultimately guessed the committee's behavior incorrectly in a couple of places, but it contains a list of alternate possibilities for the dreaded "bubble identification" stage.

Finally, to learn interactively, there's the tournament selection script "You Are The Committee", which also offers a self-serve what-if interface that lets you change some of the results and see how that would have affected the ratings used for tournament selection.