USCHO Fan Forum



Division I Rutter Rankings for 2013-2014

Re: Division I Rutter Rankings for 2013-2014

Not sure there is any value in having a carry-over effect unless you are trying to determine the best program, not team, of the past few years.

Despite a bachelor's degree in statistics, I have almost no training in Bayesian methods, but my understanding is that the priors are always still in the formula. Depending on how you do it, the more recent data are weighted more heavily, and over time the weighting of the priors decreases until it is asymptotically close to zero. So the priors have no measurable influence on the output, but they are still there.
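
That shrinking influence is easy to see in the simplest conjugate case. Here is a minimal sketch (a normal-normal model with made-up numbers, not the actual Rutter formulation): the posterior mean is a weighted average of the prior mean and the data mean, and the prior's weight decays toward zero as games accumulate without ever reaching it.

```python
# Hypothetical normal-normal conjugate update (illustrative numbers only).
prior_mean, prior_var = 1.0, 0.5   # last season's rating as the prior
obs_var = 1.0                      # per-game observation variance
data_mean = 1.8                    # this season's average performance

for n in [1, 5, 30, 100]:
    # Precision-weighted average: the prior's weight shrinks as n grows.
    w_prior = (1 / prior_var) / (1 / prior_var + n / obs_var)
    post_mean = w_prior * prior_mean + (1 - w_prior) * data_mean
    print(f"n={n:3d}  prior weight={w_prior:.4f}  posterior mean={post_mean:.4f}")
```

By n = 30 (roughly a season), the prior's weight is already down to about 6%, which matches the intuition above: still in the formula, but barely measurable.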

So my question is really one of curiosity about statistical method rather than trying to figure out what the current values mean.
 
Re: Division I Rutter Rankings for 2013-2014

Despite a bachelor's degree in statistics, I have almost no training in Bayesian methods, but my understanding is that the priors are always still in the formula. Depending on how you do it, the more recent data are weighted more heavily, and over time the weighting of the priors decreases until it is asymptotically close to zero. So the priors have no measurable influence on the output, but they are still there.

So my question is really one of curiosity about statistical method rather than trying to figure out what the current values mean.

A better question might be: "Can a mathematical model predict human behavioral outcomes?"
Google "Hull"
No, not Brett or Bobby: Clark, a professor at Yale.

You guys were really, really scratching hard for points to argue yesterday.
When someone wins an NCAA championship, it's like winning the Super Bowl, the World Series, or a Stanley Cup: the ultimate achievement in that sport. If the players on the winning team want to call themselves the best, nobody on the other teams is going to argue, although some of the nimrod fans will.

Given that the ranking and seeding in the NCAA tournament favors non-WCHA teams, it only makes the achievement of UMD, Wisconsin, & Minnesota all the more remarkable. If I were a player on one of the other teams I’d be insulted that people seem to think I need an advantage to compete with one of those teams. Eventually a non-WCHA team will win the NCAA tournament. At that point it will have to be determined “was the championship won fairly?” & “how much longer should this advantage continue?”
 
Re: Division I Rutter Rankings for 2013-2014

When someone wins an NCAA championship, it's like winning the Super Bowl, the World Series, or a Stanley Cup: the ultimate achievement in that sport.
Correct. But "winning the championship" and "being the best" (insofar as "best" means "most talented") are not necessarily the same thing.

If the players on the winning team want to call themselves the best, nobody on the other teams is going to argue, although some of the nimrod fans will.
I will ask you again. If BC had put in a goal in overtime against Minnesota last year, would you tell me that Minnesota was not the best team last year?

Given that the ranking and seeding in the NCAA tournament favors non-WCHA teams, it only makes the achievement of UMD, Wisconsin, & Minnesota all the more remarkable. If I were a player on one of the other teams I’d be insulted that people seem to think I need an advantage to compete with one of those teams.
Advantage? What?

What the actual Christ are you even talking about at this point?

Eventually a non-WCHA team will win the NCAA tournament. At that point it will have to be determined “was the championship won fairly?” & “how much longer should this advantage continue?”
You have completely and utterly gone off the deep end, buddy.
 
Re: Division I Rutter Rankings for 2013-2014

Correct. But "winning the championship" and "being the best" (insofar as "best" means "most talented") are not necessarily the same thing.

That may be so, but years later, typically the only thing most people remember is "who won the championship."

Unless it involves the highway robbery of the McGill team "robbing" the Eagles of their undefeated season back in the '40s. :D
 
Re: Division I Rutter Rankings for 2013-2014

That may be so, but years later, typically the only thing most people remember is "who won the championship."
I get this but that is not the point.

Ranking systems aren't trying to predict the champion. They are trying to predict who has the best chance to be the champion, i.e., who is the best team.

There is a subtle difference, but a real one.
 
I get this but that is not the point.

Ranking systems aren't trying to predict the champion. They are trying to predict who has the best chance to be the champion, i.e., who is the best team.

There is a subtle difference, but a real one.

No, rankings try to establish the pecking order based on a set of criteria to date.
 
Re: Division I Rutter Rankings for 2013-2014

That may be so, but years later, typically the only thing most people remember is "who won the championship."

In most years that's true. But it's also the case that in most years the team that wins the championship could have built a legitimate case, though often not an overwhelming one, for being the best team before the tournament started. The occasional years in which that's not the case provide an interesting test.

For example, in men's NCAA ice hockey, I suspect that the 1996-97 Michigan Wolverines are still better remembered than the 1996-97 North Dakota Sioux despite the fact that Michigan lost in the semis and North Dakota won the title.
 
Re: Division I Rutter Rankings for 2013-2014

Just by way of information, it appears you have a missing or broken link (and possibly a typo) on your D1 front page.

The last sentence of the third paragraph reads "For information of the prior distributions, click here", but there is nothing to click at "here".

Apparently, that was something I was going to add when I first put the page up and forgot about. I have added it to my "to-do" list. Thanks for pointing that out.
 
Re: Division I Rutter Rankings for 2013-2014

OK, a couple of points have been brought up that I am happy to address.

- The Bayesian Prior

There is nothing in my model that turns off the Bayesian prior after X number of games. That was a decision on my part based on "Bayesian Philosophy," if you will. By the end of the season, the current season's data has far more influence on the rating than the prior based on previous seasons. So by the end of the season, the only time the prior would have any effect is when two teams have very similar ratings based on the current season's data. At that point the prior would give the team that had a better rating last season a higher rating.

From a pure statistical standpoint, the effect of the prior is roughly equivalent to winning half a game against an opponent whose rating equals the team's rating from last year.

On the surface, this may seem troubling. But I think it also exposes a flaw in all ranking systems. We are trying to estimate a team's true ability (what I call rating) based on a limited number of observations in which there is "observation error" (hockey games are close, and the team with the better ability does not always win) and "process error" (a team's ability is not constant, be it due to injuries, things going on at school, etc.). So trying to estimate a team's true ability based on thirty or so binary responses in the presence of all this uncertainty is a difficult task.

The fact that we say team A is better than team B based on a single point estimate is a little troubling. As a statistician, I would never say drug A is better than drug B based on point estimates alone. We look at estimates of uncertainty and confidence intervals as well. That is why I think my "uncertainty page" is by far the most important part of my results. Here are two teams from the end of last year:
Code:
Rank   Team               Rutter Rating
4      Boston University  1.2374
5      Wisconsin          1.1882

Based on the point estimates, Boston University is the better team. But look at the estimates of uncertainty.

Code:
Team               25th  75th  (ranking percentiles)
Boston University   3     7
Wisconsin           3     7

Looking at the uncertainty in the ratings, BU is ranked from 3 to 7 at least 50% of the time, and Wisconsin is also ranked from 3 to 7 at least 50% of the time. In my eye, given the similarity of the final ratings and accounting for the uncertainty, these two teams are essentially equal in ability. If I had to pick one above the other for seeding purposes, the nod would go to BU, but in terms of establishing the quality of the teams they are the same.
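
For anyone curious how a ranking-percentile table like the one above can be generated, here is a hedged sketch: draw plausible ratings from each team's posterior (the means and standard deviations below are illustrative numbers, not the actual Rutter posteriors), rank each simulated draw, and read off the 25th and 75th percentiles of each team's rank.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior means and SDs for five teams
# (illustrative only -- not the actual Rutter posteriors).
teams = ["Boston University", "Wisconsin", "Cornell", "Harvard", "Clarkson"]
means = np.array([1.24, 1.19, 1.42, 1.12, 1.00])
sds   = np.array([0.15, 0.15, 0.15, 0.15, 0.15])

# Simulate many plausible "true rating" draws and rank each draw.
draws = rng.normal(means, sds, size=(10_000, len(teams)))
ranks = (-draws).argsort(axis=1).argsort(axis=1) + 1  # 1 = best rating

for i, t in enumerate(teams):
    lo, hi = np.percentile(ranks[:, i], [25, 75])
    print(f"{t:18s} 25th={lo:.0f}  75th={hi:.0f}")
```

With posteriors this close, teams with slightly different point estimates end up with heavily overlapping rank distributions, which is exactly the point being made about BU and Wisconsin.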

Given all the uncertainty in the data and the ability to quantify the uncertainty in my rankings/ratings, I think the effect of the prior by the end of the season for all practical purposes is zero. If my only goal was to seed the teams, I would take it out. But if the prior is enough to change the RANKING at the end of the season, the uncertainty in the data would call for those teams to be equal in terms of RANKING and RATING, which I think is more important.

- Predicting the National Champion

Have you ever noticed that statisticians are never asked to predict the Super Bowl? If asked, they would never say "Buffalo will win"; they would say "Buffalo has a 54% chance to win," which makes for horrible TV (please allow me to live in a fantasy world in which the Bills are good).

If you have read the technical documents linked from my web page or have been reading my posts for the past 10+ years, you may know that since I use a statistical model to create my ratings, it is possible to estimate the probability that team A beats team B. To do this, take the rating of team A minus the rating of team B. This difference is the mean of a normal distribution with standard deviation 1. The area under this normal curve that is greater than 0 is the probability that team A will win (ignoring ties in this example).
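
That calculation fits in a few lines (ties ignored, using Python's math.erf for the standard normal CDF; the ratings in the example are the published end-of-2012-13 values for BU and Wisconsin):

```python
from math import erf, sqrt

def win_prob(rating_a, rating_b):
    """P(team A beats team B): the area above 0 of a normal
    distribution with mean rating_a - rating_b and SD 1,
    i.e. the standard normal CDF at the rating difference."""
    return 0.5 * (1 + erf((rating_a - rating_b) / sqrt(2)))

# BU (1.2374) vs. Wisconsin (1.1882): nearly a coin flip.
print(round(win_prob(1.2374, 1.1882), 3))
```

With ratings that close, the probability barely clears 50%, which echoes the uncertainty discussion above.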

Therefore, given the NCAA tournament bracket, it would be possible to calculate the probability that each team wins the NCAA championship. Since I am using a statistical model, it is incorrect to say that my ratings/rankings mean that Minnesota will win the tournament. If you use last year's ratings and bracket, assuming that Minnesota would face the teams they were supposed to, you can show that my model says the probability of Minnesota winning the NCAA tournament is 93.7%. If I took the time to do all the calculations, it would be a little higher since I didn't account for any other upsets.
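
As a rough illustration of the bracket calculation, here is a Monte Carlo sketch using a simplified four-team, single-elimination field (the real NCAA field has more teams and different pairings; the ratings come from the final 2012-13 table):

```python
import random
from math import erf, sqrt

def win_prob(a, b):
    # Standard normal CDF at the rating difference.
    return 0.5 * (1 + erf((a - b) / sqrt(2)))

def simulate_bracket(ratings, trials=100_000, seed=42):
    """Estimate championship probabilities for a hypothetical
    4-team single-elimination bracket (pairings: 1v4, 2v3)."""
    random.seed(seed)
    names = list(ratings)
    titles = {t: 0 for t in names}
    for _ in range(trials):
        s1 = names[0] if random.random() < win_prob(ratings[names[0]], ratings[names[3]]) else names[3]
        s2 = names[1] if random.random() < win_prob(ratings[names[1]], ratings[names[2]]) else names[2]
        champ = s1 if random.random() < win_prob(ratings[s1], ratings[s2]) else s2
        titles[champ] += 1
    return {t: n / trials for t, n in titles.items()}

probs = simulate_bracket({"Minnesota": 3.33, "Cornell": 1.42,
                          "North Dakota": 1.33, "Boston Univ.": 1.24})
print(probs)
```

Even in this toy bracket, Minnesota's enormous rating gap translates into a championship probability well above 90%, in the same neighborhood as the 93.7% figure quoted above.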

Given that Minnesota was undefeated in the regular season, their Rutter Rating was huge. Even under those circumstances, my model says they had about a 5% chance of not winning. My guess is that if you re-played the NCAA tournament 100 times, Minnesota would probably only win about 80-90 of them, as I think their Rutter Rating was too high. If they had played 30 games against top-10 opponents, they would not have gone undefeated. However, since the tournament was only played once, we will never be able to make that comparison, so talking about how well my model/ratings predict reality is difficult.

If this year my model says UM has a 38% chance of winning the tournament and Clarkson has a 34% chance, and Clarkson wins, how did my model do? In my day job, I have a model that predicts E. coli levels at Presque Isle State Park in Erie, PA. I don't evaluate the accuracy of that model based on one day; I look at the 100 predictions created over the summer. In 40 years, we will be able to look back at 50 years of my model and see how I am doing.
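
On that point about evaluating many forecasts at once rather than a single game: a standard tool is a proper scoring rule such as the Brier score, the mean squared error between forecast probabilities and 0/1 outcomes (lower is better; always saying 50% scores 0.25). The numbers below are made up purely for illustration:

```python
def brier(probs, outcomes):
    """Brier score: mean squared error between forecast probabilities
    and binary outcomes (1 = event happened). Lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Five hypothetical forecasts and what actually happened:
forecasts = [0.90, 0.70, 0.38, 0.34, 0.55]
results   = [1,    1,    0,    1,    0]
print(round(brier(forecasts, results), 4))
```

A model that says "Clarkson has a 34% chance" and then watches Clarkson win isn't wrong; it just takes a season (or fifty) of such forecasts to tell whether the stated probabilities are honest.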
 
Re: Division I Rutter Rankings for 2013-2014

OK, a couple of points have been brought up that I am happy to address.
...
In 40 years, we will be able to look back at 50 years of my model and see how I am doing.
This post made my math major heart happy
 
Re: Division I Rutter Rankings for 2013-2014

Who says the polls don't mean much? If nothing else they sure do generate a lot of controversy! ;)
 
Re: Division I Rutter Rankings for 2013-2014

Well, BU lost to an unranked RMU 3-0 so that will def mix things up.
 
Re: Division I Rutter Rankings for 2013-2014

Have you ever noticed that statisticians are never asked to predict the Super Bowl? If asked, they would never say "Buffalo will win"; they would say "Buffalo has a 54% chance to win,"
...
Given that Minnesota was undefeated in the regular season, their Rutter Rating was huge. ... If they had played 30 games against top-10 opponents, they would not have gone undefeated.
While the final sentence that I have quoted may be true, as a statistician, you are unable to make an absolute statement declaring it to be true. This is your final top 10 from last season:
Final rankings for the 2012-2013 season

Code:
Rank  Team            Rating  Last Week
1     Minnesota       3.33    1
2     Cornell         1.42    2
3     North Dakota    1.33    3
4     Boston Univ.    1.24    6
5     Wisconsin       1.19    4
6     Boston College  1.14    5
7     Harvard         1.12    7
8     Clarkson        1.00    8
9     Mercyhurst      0.81    9
10    UMD             0.72    10
Minnesota did go 16-0 versus your top 10, including 12-0 against your top 6. Saying that they would have lost at least one game if you added 14 more games against the top 10 could possibly be true. I do know enough math to know that the Gophers' chance of winning all 14 of those games would be greater than 0% in your model.
 
Re: Division I Rutter Rankings for 2013-2014

While the final sentence that I have quoted may be true, as a statistician, you are unable to make an absolute statement declaring it to be true.

Minnesota did go 16-0 versus your top 10, including 12-0 against your top 6. Saying that they would lose if you added 14 more games against the top 10 could possibly be true. I do know enough math to know that the Gophers' chance of winning all 14 of those games would be greater than 0% in your model.

ARM, you are quite right. As a statistician, I can say that if UM played 14 more games against opponents with an average Rutter Rating of 1.2, they would have roughly an 80% chance of winning all of those games. I used a poor argument as to why I think a 95% chance of winning the tournament is too high.
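
That "roughly 80%" can be checked directly under the model described earlier in the thread: the rating difference is 3.33 - 1.2 = 2.13, the per-game win probability is the standard normal CDF at 2.13, and winning all 14 games is that probability raised to the 14th power.

```python
from math import erf, sqrt

def win_prob(a, b):
    # Standard normal CDF evaluated at the rating difference.
    return 0.5 * (1 + erf((a - b) / sqrt(2)))

p_game = win_prob(3.33, 1.2)   # Minnesota vs. a 1.2-rated opponent
p_all_14 = p_game ** 14        # winning all 14 hypothetical extra games
print(round(p_game, 3), round(p_all_14, 3))
```

Each game is about a 98% proposition, yet stringing 14 of them together drops the overall probability to roughly 79%, which is well above 0% but comfortably short of certainty, as ARM notes.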

I think this is a better one: tournament hockey is different from the regular season. I'll give two "mechanical" reasons. One is that fewer penalties are typically called in the post-season than in the regular season. The second is sudden-death overtime, which is a different beast. If a great team gives up a goal early in a game, they have the rest of the game to come back. If they give up a goal early in sudden-death overtime, they are out.

Due to a lack of data (only seven games a year), it is difficult to prove that the regular-season model overestimates the probability of the higher-rated team winning in the post-season. The hockey "expert" part of me says that Minnesota winning 95 out of 100 simulated NCAA tournaments is too high, given that games tend to be closer in the post-season. One way to argue that is to say that Minnesota's 2012-13 Rutter Rating is inflated, which is purely opinion. The other way is to say that the model used for regular-season ratings is not applicable to tournament hockey, which is something I may be able to show once a large number of NCAA tournament games have been played over a number of years.
 
Re: Division I Rutter Rankings for 2013-2014

Have you ever noticed that statisticians are never asked to predict the Super Bowl? If asked, they would never say "Buffalo will win"; they would say "Buffalo has a 54% chance to win," which makes for horrible TV (please allow me to live in a fantasy world in which the Bills are good).
...

Saying Buffalo has a 54% chance to win IS making a prediction.
 
Re: Division I Rutter Rankings for 2013-2014

When I hear the weatherman say there's a 70% chance of rain, is that a prediction? If not they are never wrong!

(but maybe that's part of the science)
 