OK, a couple of points have been brought up that I am happy to address.
- The Bayesian Prior
There is nothing in my model that turns off the Bayesian prior after X number of games. That was a decision on my part based on "Bayesian Philosophy," if you will. By the end of the season, the current season's data has far more influence on the rating than the prior based on previous seasons. So by the end of the season, the only time the prior would have any effect is when two teams have very similar ratings based on the current season's data. At that point the prior would give the higher rating to the team that had the better rating last season.
From a pure statistical standpoint, the effect of the prior is roughly equivalent to winning half a game against an opponent whose rating equals the team's rating from last year.
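For anyone who wants to see the mechanics, here is a rough sketch of how a prior like that could enter a fit. It assumes a probit-style win model, where the chance of a win is the normal probability of the rating difference (consistent with how I describe the ratings below), and treats the prior as half of one extra "win" against an opponent at last year's rating. The function names and toy data are made up for illustration; this is not my actual code.
Code:
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

# Sketch only: probit-style model with P(win) = Phi(r_team - r_opp).
# The prior acts like half of one extra "win" against an opponent
# rated at the team's rating from the end of last season.
def neg_log_lik(r, opp_ratings, results, r_last_year, prior_weight=0.5):
    p = norm.cdf(r - np.asarray(opp_ratings))          # P(win) for each game
    games = np.sum(results * np.log(p) + (1 - results) * np.log(1 - p))
    prior = prior_weight * np.log(norm.cdf(r - r_last_year))
    return -(games + prior)

opp_ratings = np.array([0.8, 0.2, -0.1, 1.0, 0.5])     # toy schedule
results     = np.array([1.0, 1.0, 1.0, 0.0, 1.0])      # 1 = win, 0 = loss
fit = minimize_scalar(neg_log_lik, bounds=(-3, 3), method="bounded",
                      args=(opp_ratings, results, 1.2))
print(f"estimated rating: {fit.x:.3f}")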
On the surface, this may seem troubling. But I think it also exposes a flaw in all ranking systems. We are trying to estimate a team's true ability (what I call rating) based on a limited number of observations in which there is "observation error" (hockey games are close, and the team with the better ability does not always win) and "process error" (a team's ability is not constant, be it due to injuries, things going on at school, etc.). So trying to estimate a team's true ability based on thirty or so binary responses in the presence of all this uncertainty is a difficult task.
The fact that we say team A is better than team B based on a single point estimate is a little troubling. As a statistician, I would never say drug A is better than drug B based on point estimates alone; we look at estimates of uncertainty and confidence intervals as well. That is why I think my "uncertainty page" is by far the most important part of my results. Here are two teams from the end of last year:
Code:
Rank  Team                Rutter Rating
   4  Boston University          1.2374
   5  Wisconsin                  1.1882
Based on the point estimates, Boston University is the better team. But look at the estimates of uncertainty.
Code:
Team                 25th Percentile Rank   75th Percentile Rank
Boston University                       3                      7
Wisconsin                               3                      7
Looking at the uncertainty in the ratings, at least 50% of the time BU is ranked from 3-7, and Wisconsin is also ranked from 3-7 at least 50% of the time. In my eyes, given the similarity of the final ratings and accounting for the uncertainty, these two teams are essentially equal in ability. If I had to pick one above the other for seeding purposes, the nod would go to BU, but in terms of establishing the quality of the teams they are the same.
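For what it is worth, those 25th/75th percentile ranks are straightforward to compute once you have a set of posterior draws of every team's rating: rank the teams within each draw, then summarize the ranks. Here is a toy version, with made-up teams, point estimates, and spreads, and independent normal draws standing in for whatever the real posterior looks like:
Code:
import numpy as np

rng = np.random.default_rng(1)
teams = ["Boston University", "Wisconsin", "Team C", "Team D", "Team E"]
means = np.array([1.24, 1.19, 1.10, 0.95, 0.90])   # toy point estimates
sds   = np.array([0.15, 0.15, 0.12, 0.10, 0.10])   # toy posterior spreads

draws = rng.normal(means, sds, size=(4000, len(teams)))  # 4000 posterior draws
ranks = (-draws).argsort(axis=1).argsort(axis=1) + 1     # rank 1 = best, per draw
lo, hi = np.percentile(ranks, [25, 75], axis=0)
for team, a, b in zip(teams, lo, hi):
    print(f"{team:20s} 25th = {int(a)}   75th = {int(b)}")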
Given all the uncertainty in the data, and the ability to quantify that uncertainty in my rankings/ratings, I think the effect of the prior by the end of the season is, for all practical purposes, zero. If my only goal were to seed the teams, I would take it out. But if the prior is enough to change the RANKING at the end of the season, the uncertainty in the data would call for those teams to be treated as equal in terms of RANKING and RATING, which I think is more important.
- Predicting the National Champion
Have you ever noticed that statisticians are never asked to predict the Super Bowl? If asked, they would never say "Buffalo will win." They would say "Buffalo has a 54% chance to win," which makes for horrible TV (please allow me to live in a fantasy world in which the Bills are good).
If you have read the technical documents linked from my web page, or have been reading my posts for the past 10+ years, you may know that since I use a statistical model to create my ratings, it is possible to estimate the probability that team A beats team B. To do this, take the difference in ratings, team A minus team B. That difference is the mean of a normal distribution with standard deviation 1, and the area under this normal curve that is greater than 0 is the probability that team A wins (ignoring ties in this example).
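In code, that calculation is essentially one line. A quick sketch using scipy's normal distribution, plugging in the end-of-season ratings from the table above:
Code:
from scipy.stats import norm

def win_prob(rating_a, rating_b):
    # P(A beats B) = area above zero of a Normal(mean = r_A - r_B, sd = 1),
    # which is the normal CDF evaluated at the rating difference.
    return norm.cdf(rating_a - rating_b)

print(win_prob(1.2374, 1.1882))   # BU over Wisconsin, roughly 0.52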
Therefore, given the NCAA tournament bracket, it is possible to calculate the probability that each team wins the NCAA championship. Since I am using a statistical model, it is incorrect to say that my ratings/rankings mean that Minnesota will win the tournament. If you use last year's ratings and bracket, and assume Minnesota faces the teams they were expected to face, my model says the probability of Minnesota winning the NCAA tournament is 93.7%. If I took the time to do the full calculation, accounting for possible upsets elsewhere in the bracket, it would be a little higher.
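If you would rather not grind through the bracket algebra by hand, a small Monte Carlo simulation gives the same sort of answer to within simulation error: draw a winner for every game using the win probability above and count how often each team takes the title. The four-team field and ratings below are made up purely to show the shape of the calculation, not taken from my actual results.
Code:
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

def simulate_bracket(ratings, n_sims=100000):
    # Teams are listed in bracket order; each round pairs neighbors.
    teams = list(ratings)
    titles = dict.fromkeys(teams, 0)
    for _ in range(n_sims):
        field = teams[:]
        while len(field) > 1:
            nxt = []
            for a, b in zip(field[::2], field[1::2]):
                p = norm.cdf(ratings[a] - ratings[b])   # P(a beats b)
                nxt.append(a if rng.random() < p else b)
            field = nxt
        titles[field[0]] += 1
    return {t: c / n_sims for t, c in titles.items()}

# Made-up four-team bracket, just to illustrate:
ratings = {"Minnesota": 2.3, "Clarkson": 1.4, "BU": 1.2, "Wisconsin": 1.2}
print(simulate_bracket(ratings))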
Given that Minnesota was undefeated in the regular season, their Rutter Rating was huge. Even under those circumstances, my model says they had about a 5% chance of not winning. My guess is that if you re-played the NCAA tournament 100 times, Minnesota would probably only win about 80-90 of them, as I think their Rutter Rating was too high: if they had played 30 games against top-10 opponents, they would not have gone undefeated. However, since the tournament was only played once, we will never be able to make that comparison, so talking about how well my model/ratings predict reality is difficult.
If this year my model says UM has a 38% chance of winning the tournament and Clarkson has a 34% chance, and Clarkson wins, how did my model do? In my day job, I have a model that predicts E. coli levels at Presque Isle State Park in Erie, PA. I don't evaluate the accuracy of that model based on one day; I look at the 100 or so predictions made over the summer. In 40 years, when we can look back at 50 years of my model, we will see how I am doing.