USCHO Fan Forum


A new ranking system for college hockey

Re: A new ranking system for college hockey

It has Yale at #3, therefore it needs a lot of work still. No computer in its right mind would put them at #3. #13 maybe...

Well, KRACH is in its right mind. And PWR is a little screwy but still has some remaining marbles, as does its component RPI. Computer schemes all, and all generating similar results and all taking into account strength of schedule. At some point, does the evidence convince you at all?

By the way, KRACH gives a probability of Yale winning the tournament at 14.2 percent, while my method gives a probability of 13.0 percent. I believe this method to be substantially more accurate than KRACH for this purpose, but it will be hard, no matter what happens over one tournament, to separate percentage differences of this size. In general, this method lowers the probability of tournament victory for the better teams and raises it for the lower-ranked teams when compared with KRACH.
 
Re: A new ranking system for college hockey

I disagree with this premise. I think it depends on what you're expecting the rating system to do. Not all systems have to be predictive in nature.

OK, I will concede this point. I guess what I should have said is that I personally am not interested in a rating system that does not predict future results, and that I measure a rating system's usefulness to me by its ability to predict the future. I suppose that if I were given the task of selecting the NCAA tournament field, I would be interested in a "retrodictive" system like KRACH rather than a predictive system like this one.
 
Re: A new ranking system for college hockey

Right, but then ultimately the W is the most important thing. Bill James simply says a team that wins lots of 1-run games is a statistical aberration and unlikely to continue to do so in the future. He doesn't say that they shouldn't count, however.

Same thing here; all else being equal, a team that goes 19-7-2 with a negative scoring differential is still more deserving of an NCAA bid than a team that goes 12-12-4 with a positive scoring differential. Doesn't matter that they got lucky or won lots of close games while getting blown out in others; all that matters is that they won.

At some level you're obviously right, and that's why Air Force is in the tournament. In fact, we cherish a system that requires you only to get hot at the end of the year. Look how close Colgate got. But in the real world, you don't see a lot of 19-7-2 teams with negative scoring differentials, particularly when adjusted for who they played. The proof of this is that I don't need to know anything about wins to figure out who's a good team. Wins are the residue of goal scoring, goal preventing and luck. And there's still a heck of a lot of luck involved, and luck counts. Furthermore, over 35 or so games played, luck doesn't have to even out. If you want to interpret it that way, what I'm saying is that Wisconsin is unlucky not to have a better record. And I'm not proposing to replace the NCAA selection process with a mindless application of this method or KRACH. I'm trying to see what you can say about which teams will win games without knowing which teams won games. And it turns out you can say a lot. And furthermore you can say a lot more than you can just by looking at goals per game and goals against per game.

As to its predictive powers, how would you do it, Unofan? Surely you wouldn't just predict that whatever team won the most games in a matchup would be more likely to win the next one. And surely you wouldn't just use net goal differential, either, since otherwise RIT is the 10th best team in the country. How do you adjust for who you've played?
 
Re: A new ranking system for college hockey

As to its predictive powers, how would you do it, Unofan?

I don't as I'm not in the soothsaying business.

But to the extent you want a real answer, I'd say it depends what you're trying to do.

Trying to predict the winner of a specific matchup? Great, use every available metric out there.

Trying to pick a playoff field? Outcomes of games are paramount while margin of victory is meaningless. To the extent you need to control for SoS in NCAA sports because teams not only don't play everyone, but in fact won't play more teams than they will play, that SoS should be based strictly on pure results too. Playoffs aren't really about getting the 16 strongest or best teams in the field, but instead the 16 most successful teams for a given year. If that means a luckier but weaker team gets in over a better but unlucky one, so be it.
 
Re: A new ranking system for college hockey

"The best thing I like about these rankings that were churned out is that they almost exactly (with one exception) coincide with my list of the 8 teams I think are in the hunt to win the whole thing. Swap Union and Denver and you're there"---he said, self-servingly.
 
Re: A new ranking system for college hockey

Well, KRACH is in its right mind. And PWR is a little screwy but still has some remaining marbles, as does its component RPI. Computer schemes all, and all generating similar results and all taking into account strength of schedule. At some point, does the evidence convince you at all?

Oh evidence always convinces me, just not predictive computer models that are theoretical in nature.
 
Re: A new ranking system for college hockey

Oh evidence always convinces me, just not predictive computer models that are theoretical in nature.

What sort of non-theoretical predictive model did you have in mind? Alphabetical order of team names? Inverse age of school? Nickname competition?

Re: A new ranking system for college hockey

What sort of non-theoretical predictive model did you have in mind?

There is a story, perhaps apocryphal, of a grade-school student who nearly predicted a perfect NCAA men's division one basketball tournament bracket. His method? He had the team mascots face off in a fantasy challenge and picked the winners that way (you know, a Jayhawk would peck out the eyes of a Husky but it would get trapped by a Pioneer, that kind of thing).

On a more serious note, I'm not sure if I can articulate my question in words instead of math...isn't it the case that most statistical models have a likelihood, a margin of error, and a time frame? Wouldn't the number of playoff games in the tournament make a difference?

In other words, wouldn't there be a significant difference in the effectiveness of your model in a best-of-seven series (all those references to Bill James earlier) compared to a one-and-done tournament? (like a basketball game in which a referee has trouble counting to five?)

Re: A new ranking system for college hockey

Ok, so here's my input.
It seems to me that a lot of games are affected by the ENG at the end. I know you said that ENGs are a special case, but it really seems to me that the margin of victory is doubled, or nearly so, in an awful lot of games. So I wonder if this is really accounted for correctly.
Second, I'd be curious whether you'd get the same result if you dropped the top 10% and the bottom 10% of games by goal differential. I agree with the NCAA to some extent that margin of victory, though an indicator, isn't as important as the win.
 
Re: A new ranking system for college hockey

http://it.stlawu.edu/~chodr/

Have you looked at the above site? I'm on my mobile and haven't looked at your paper, but it sounds like what SLU's Robin Lock has been doing for years.

I agree these rankings are more helpful for predictions. But there's zero chance a ranking using data other than wins and losses and ties from the current season ever gets adopted by the NCAA. This is a philosophical issue, not a mathematical one, and the NCAA has made its philosophy very clear through the kinds of selection criteria used across a wide range of sports.

I see no philosophical barrier to the NCAA one day adopting KRACH instead of RPI, however, though if anyone disagrees, please enlighten me.
 
Re: A new ranking system for college hockey

OK, just read more of the thread, and you clearly are more interested in being predictive.
 
Re: A new ranking system for college hockey

There is a story, perhaps apocryphal, of a grade-school student who nearly predicted a perfect NCAA men's division one basketball tournament bracket. His method? He had the team mascots face off in a fantasy challenge and picked the winners that way (you know, a Jayhawk would peck out the eyes of a Husky but it would get trapped by a Pioneer, that kind of thing).

On a more serious note, I'm not sure if I can articulate my question in words instead of math...isn't it the case that most statistical models have a likelihood, a margin of error, and a time frame? Wouldn't the number of playoff games in the tournament make a difference?

In other words, wouldn't there be a significant difference in the effectiveness of your model in a best-of-seven series (all those references to Bill James earlier) compared to a one-and-done tournament? (like a basketball game in which a referee has trouble counting to five?)

Picking an NCAA pool is a bit of knowledge and a bit of luck... the Facebook NCAA pools are great for that... some girl at some D1 school goes 32/32 and it's "oh, wow, I am so awesome"... now, doing this consistently over the course of the year is another matter. You know what the (generally) statistically optimal thing to do is? Pass the higher seeds through nearly without question... every once in a while you'll have a low-balled low seed (Winthrop a couple of years ago... College of Charleston, IIRC, 12 years back or so) but that #2 is more likely to be knocked out by somebody than that #1 nearly every time... note, more likely.... #11 George Mason and #11 Villanova speak to the reliability of this statement.

A few of you recall that there was a "guess the score" website that I'll bet got C&D'd on gambling grounds... a third of the way through the season I was sitting in 2nd place out of everybody... and I am nearly certain I would have won. Why? Well, when most people guess the score they go with their hearts... pick upsets... 5-1, 2-0, 1-0... etc. See, while certain scores were prioritized (shutouts), they turn out to be a lot less likely than the reward given for correctly predicting one would justify... this wasn't totally the case. So, what I did for the first couple of weeks was pick the winner and guess 3-2... why? Because that's the most likely result... technically 3-3 is, but with ties this makes this result a lot less likely... IIRC you may have gotten points for wins as well... wins and ties and score in the formulation means it's better to go for wins and losses... who cares if your gut says 1-1... the law of large numbers means I can give every game a 3-2 guess... and I'll do better in the long haul. I think I had a Poisson model in place along the lines of Robin Lock's CHODR... I had adjusted for overtime length (hell, I can code that quick now given data)... Robin Lock does not adjust for overtime length (IIRC) and removes ENGs from the model.

Too bad, I missed my opportunity on a hockey jersey.

Anyhow... I'll try to look into the model... my brain is so fried from work, while I'll want to read I'm not sure how much I really want to extract, carp, and snipe right now... because I am a detailed opinionated bastard.

---

All that being said, I believe Poisson score models have strong predictive power in ice hockey... one could argue about the applicability of the Poisson, effects of extra-man, PP, etc., etc., etc. but it's as useful as anything else. That being said, I tend to look at GF-GA differences for measures of strength... I can plug in certain Poisson parameters for goal scoring rates and the margin of separation that does not occur would amaze you... you can be demonstrably better and still lose a game 10-20% of the time... I've drawn entire NHL seasons based on the Poisson distribution and I've seen teams fluctuate by 20-30 points from draw to draw... the only consistent things were the top teams and even those sometimes missed the playoffs in the sim. So, for as much as it is predictive, on any given day things happen. Now, in college the talent is a lot more separable... but the behavior is the behavior.
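To put a rough number on "demonstrably better and still lose," here is a minimal sketch that treats each team's score as an independent Poisson draw and adds up the win, tie, and loss probabilities. The scoring rates (3.5 and 2.0 goals per game) and the function names are made up for illustration; they are not taken from anyone's fitted model in this thread.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals arrive at the given average rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def outcome_probs(rate_a, rate_b, max_goals=25):
    """Return (P(A wins), P(tie), P(B wins)) for independent Poisson scores.

    The sums are truncated at max_goals per team, which leaves a negligible
    tail for hockey-sized scoring rates.
    """
    win_a = tie = win_b = 0.0
    for a in range(max_goals + 1):
        pa = poisson_pmf(a, rate_a)
        for b in range(max_goals + 1):
            p = pa * poisson_pmf(b, rate_b)
            if a > b:
                win_a += p
            elif a == b:
                tie += p
            else:
                win_b += p
    return win_a, tie, win_b

# Hypothetical rates: the stronger team averages 3.5 goals, the weaker 2.0.
win_a, tie, win_b = outcome_probs(3.5, 2.0)
print(f"strong wins {win_a:.3f}, tie after regulation {tie:.3f}, weak wins {win_b:.3f}")
```

Even with a gap of a goal and a half per game, the weaker side still wins outright a noticeable share of the time under these assumptions, which is the kind of overlap described above.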

edit: at this point, having done a lot of "small area estimation" at work... if I were to punt out a model... seeing that I understand Mr. Rutter's work a bit more (setting up a random effects model on the Bradley-Terry formulation)... I'd see if I could set up something similar on the score model.... of course I have no real desire to put myself through such a model build up when there's meager reward as a non-academic who wants to do things on the weekend.... it'd set up some sort of possibly correlating random-effects on the Offensive/Defensive parameters, etc. This all goes into my hypothetical dream world Pairwise Simulator that would sim out the season remaining flush with accurate tie-breaks.

edit #2: I have only skimmed the paper... I don't have the desire to cross all the t's and dot the i's... that being said... if there's a special "in conference" effect, the biggest issue I would have is that I would want to see the necessity for the term tested (classical fashion, frequentist paradigm likelihood ratio test)... if only because one does have a null model of sorts that can be applied.

I don't want to critique every part of your paper... I remember being interested in this since high school through undergrad and into grad school with some really iffy models used from other people's code... but the biggest issue with the BCS Score Based formulas is more that people are aware of them and how to manipulate them... think of some of the comments from Jurassic Park made by Goldblum's character... the method is also part of the system (this also affected quantitative investing as demonstrated by MIT's Andrew Lo)... if the football people were ignorant of the rankings and the advantages then there wouldn't be an outright problem... the flaw isn't completely in the methodology... but then again it is... harder to cheat a KRACH model during the game (scheduling, could be argued though). The problem in the end was the gaming of the system and not the impact of the model in itself.
 
Re: A new ranking system for college hockey

On a more serious note, I'm not sure if I can articulate my question in words instead of math...isn't it the case that most statistical models have a likelihood, a margin of error, and a time frame? Wouldn't the number of playoff games in the tournament make a difference?

In other words, wouldn't there be a significant difference in the effectiveness of your model in a best-of-seven series (all those references to Bill James earlier) compared to a one-and-done tournament? (like a basketball game in which a referee has trouble counting to five?)

It depends on what you mean by "effective." For example, if I say the model predicts that Team A has a 40 percent chance of winning against Team B, and they win, does that mean the model was wrong? Of course not. What is important is that when you make 10 such predictions, the underdogs win only about 4 times. Of course, even there you wouldn't be surprised to get 5 out of 10, but you'd be really surprised to get 70 out of 100 such predictions. Particularly when you're down to good teams, as you get in the tournament, 70-30 is about as lopsided as a matchup can get. That said, it's a lot more likely for UND to win the tournament than Air Force... about 14 times more likely. But even UND only has about one chance in seven of winning.
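The "5 out of 10 is no surprise, 70 out of 100 would be" intuition can be checked with a binomial tail. This is a small sketch that assumes the predictions are independent and that every underdog really does have a 40 percent chance; `binom_tail` is just an illustrative helper name.

```python
from math import comb

def binom_tail(n, k, p):
    """P(at least k successes in n independent trials, each with success probability p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Underdogs at 40 percent each: how often do they win at least 5 of 10, or at least 70 of 100?
p_5_of_10 = binom_tail(10, 5, 0.4)
p_70_of_100 = binom_tail(100, 70, 0.4)
print(f"at least 5 of 10:   {p_5_of_10:.3f}")
print(f"at least 70 of 100: {p_70_of_100:.2e}")
```

The first figure comes out at a bit over a third, while the second is vanishingly small, so a run like that would be strong evidence that the 40 percent estimates were off.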

The more times a 30 percent team plays a 70 percent team, the more likely these percentages can operate. The chance that a 30 percent team wins a one-and-done is 30 percent. The chance they win a two-out-of-three is 21.6 percent. The chance they win a three out of five is 16.3 percent. And the chance they win a four-out-of-seven is down to 12.6 percent.
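Those series figures can be reproduced with a short calculation. It is a sketch that assumes every game is independent, has the same 30 percent win chance, and cannot end in a tie; the function name is just for illustration.

```python
from math import comb

def series_win_prob(p, wins_needed):
    """P(a team with per-game win probability p is first to reach wins_needed wins)."""
    q = 1 - p
    # The team clinches in game wins_needed + k after losing exactly k of the
    # earlier games; the clinching game itself must be a win.
    return sum(comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
               for k in range(wins_needed))

for wins_needed in (1, 2, 3, 4):
    games = 2 * wins_needed - 1
    pct = 100 * series_win_prob(0.3, wins_needed)
    print(f"best of {games}: {pct:.1f} percent")
```

With p = 0.3 this prints 30.0, 21.6, 16.3 and 12.6 percent, matching the numbers above.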
 
Re: A new ranking system for college hockey

Ok, so here's my input.
It seems to me that a lot of games are affected by the ENG at the end. I know you said that ENGs are a special case, but it really seems to me that the margin of victory is doubled, or nearly so, in an awful lot of games. So I wonder if this is really accounted for correctly.
Second, I'd be curious whether you'd get the same result if you dropped the top 10% and the bottom 10% of games by goal differential. I agree with the NCAA to some extent that margin of victory, though an indicator, isn't as important as the win.

I think you're right. The way I know you're right is that good teams and bad teams have too few ties -- bad teams manage to turn those games into losses and good teams manage to turn them into wins. I suspect the last five minutes of the game has a lot to do with that. I'm working on some changes to fix that now. Thanks.
 
Re: A new ranking system for college hockey

Dave1381: Thanks... I'll check that out. It's very similar. I'll try to contact Robin directly. Thanks.
Patman: Thanks. You're rambling, but they seem to be informed ramblings. I'll try to parse them out later.
 
Re: A new ranking system for college hockey

What sort of non-theoretical predictive model did you have in mind? Alphabetical order of team names? Inverse age of school? Nickname competition?

I didn't have any in mind. You asked when evidence convinced me. I stated always. Clearly a theoretical model is not evidence.
 
Re: A new ranking system for college hockey

The more times a 30 percent team plays a 70 percent team, the more likely these percentages can operate. The chance that a 30 percent team wins a one-and-done is 30 percent. The chance they win a two-out-of-three is 21.6 percent. The chance they win a three out of five is 16.3 percent. And the chance they win a four-out-of-seven is down to 12.6 percent.

Understand that part; however, the deeper you go into the tournament, the smaller the differential between teams, while also several probability waves from the earlier rounds collapse to either 100% or 0%.

For the first effect, ignoring the collapse of probability waves, you might see 70% vs 30% in round one, 62% vs 38% in round two, and be down to 55% vs 45% by the finals, which basically would mean that in a one-game playoff the finals would be a tossup, no?

And for the second effect, I can't remember any more how that would work. If you ever read the story Flowers for Algernon in secondary school, you might empathize... I found my quantum mechanics notebook from my college days when we were moving, and I leafed through it. I recognized my handwriting, and the matrix notation we used in one section of the class (I always liked the wave equation better from an intuitive perspective, though the math of the matrix formulation was much easier to manipulate), but I could not remember what the matrix notation meant any more.

Re: A new ranking system for college hockey

Oh, now I see what you meant. There are two effects going on. As you move deeper into a tournament, possibilities that were there before are resolved into certainties. This will always raise the probability of winning, conditional on still being in the tournament. The second effect is that, on average, you face better teams, although in certain circumstances you might end up playing a weaker team than you thought you would. This effect can go either way, but, on average, this effect reduces your chance of winning. Here is the probability table from the paper:
Code:
Team               Round 1    Round 2    Round 3    Round 4
North Dakota       0.674691   0.423935   0.242492   0.137726
Yale               0.738561   0.429236   0.236398   0.130195
Miami              0.617677   0.37245    0.215619   0.122648
Boston College     0.600891   0.316406   0.16827    0.087758
Michigan           0.537063   0.280971   0.146949   0.075155
Union              0.597741   0.295489   0.139879   0.066272
Nebraska-Omaha     0.462937   0.229762   0.113909   0.055675
Denver             0.539458   0.241392   0.112702   0.052304
Notre Dame         0.505198   0.220852   0.109532   0.050669
Merrimack          0.494802   0.216963   0.101212   0.046103
New Hampshire      0.382323   0.189734   0.092202   0.04187
Western Michigan   0.460542   0.182827   0.077957   0.033484
Minnesota-Duluth   0.402259   0.187352   0.077335   0.033135
Colorado College   0.399109   0.172862   0.075728   0.032521
Rensselaer         0.325309   0.151846   0.061994   0.025776
Air Force          0.261439   0.087923   0.027822   0.00871

Every team has a lower probability of surviving from one round to the next: that's because they might lose in any round. Suppose, however, that the favorites (by this system) win all of the first-round games: then the revised table would look like this:
Code:
Team               Round 1    Round 2      Round 3       Round 4
Yale               1          0.5700341    0.29754198    0.156266223
Union              1          0.4299659    0.191234255   0.085357656
Notre Dame         1          0.3973444    0.179684486   0.078318453
Miami              1          0.6026556    0.331539279   0.179814901
North Dakota       1          0.6053119    0.332986978   0.181998721
Denver             1          0.3946881    0.175195194   0.076800695
Michigan           1          0.486122     0.236169887   0.114672091
Boston College     1          0.513878     0.255647941   0.126771259

In the last round, you just have a head-to-head contest between two teams. As I said above, there's no way that will be a bigger disparity than about 70-30 (if Air Force survives to the final to play BC). More generally, you get two high-ranked teams playing, and the probabilities will be close to 50-50. By the way, this brings up the important point that we don't generally use the evidence from the tournament itself to adjust the odds, though of course we could. If Air Force beats, say, Yale, Union and Notre Dame in succession to reach the final, then they're probably a much better team now than they have shown in the past. If you re-estimate, this will again tighten the odds.
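For anyone who wants to see how a round-by-round table of this kind can be built, here is a minimal sketch. It is not the calculation from the paper: it assumes a Bradley-Terry style head-to-head probability, rating_i / (rating_i + rating_j), which is the form KRACH uses, and the ratings in the example are invented. Teams are listed in bracket order, so teams 0 and 1 meet first, the winner meets the winner of 2 vs 3, and so on.

```python
def advance_probs(ratings):
    """Per-round survival probabilities in a single-elimination bracket.

    ratings: positive strengths in bracket order; the length must be a power of two.
    Returns a list of rounds, where rounds[r][i] is the probability that
    team i wins its game in round r + 1 (and every game before it).
    """
    n = len(ratings)

    def beat(i, j):
        # Assumed Bradley-Terry head-to-head win probability for team i over team j.
        return ratings[i] / (ratings[i] + ratings[j])

    alive = [1.0] * n   # probability each team is still in before the current round
    rounds = []
    size = 1            # number of teams in each half of a block this round
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            block = (i // (2 * size)) * (2 * size)
            # Possible opponents are the teams in the other half of i's block.
            if i < block + size:
                opponents = range(block + size, block + 2 * size)
            else:
                opponents = range(block, block + size)
            nxt[i] = alive[i] * sum(alive[j] * beat(i, j) for j in opponents)
        rounds.append(nxt)
        alive = nxt
        size *= 2
    return rounds

# Hypothetical four-team bracket: a strong favorite, a weak team, two equal teams.
for r, probs in enumerate(advance_probs([4.0, 1.0, 2.0, 2.0]), start=1):
    print(f"Round {r}:", [round(p, 4) for p in probs])
```

Conditioning on first-round results, as in the second table, amounts to rerunning the same calculation with only the surviving teams in the bracket.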
 