This paper describes a new ranking system for college hockey based on Poisson regression. It's not quite perfect yet, but I wanted to get a preliminary version of the paper out for comments. I'm posting it now because it contains probabilistic predictions for the upcoming NCAA Tournament. They differ substantially from the KRACH-based probabilistic projections I posted earlier in the week. I realize the math may be challenging for some, but I've tried to give an introduction that anybody with a familiarity with probability models can understand. I'd love to get comments from anyone about what's in here: what makes sense, what's confusing, and what's flat out wrong. By the way -- "This can't be right... My team is ranked too low" doesn't count as wrong.
A new ranking system for college hockey
Re: A new ranking system for college hockey
I think this is awesome and I'll definitely take a longer look at this once I'm finished with my thesis. Maybe this will finally help me understand how to fit my own data with a Poisson distribution.
One thing I noticed though... I assume this is a first pass to which you'll then add additional parameters. Would the assumption of scoring uniformity be violated for power plays as well? Is there some way you could reasonably factor in the likelihood of a team to take penalties, or, probably more importantly, to get scored on or to score on power plays? I would think, given that every game is going to have at least 5-6 penalties (or 15 in HE), this would be an important factor to include when trying to model a team's performance. This would probably also apply to ENGs, or goals with an extra attacker, but you could probably just subtract those goals for now.
Very cool though.
Re: A new ranking system for college hockey
Thanks, Mr. Terrier. I don't think the nonuniformity imposed by penalties is that big of a deal, as long as teams consistently draw or are called for penalties. That just fits into the background rate. You bring up a point that's worth exploring, however, which is whether teams playing interconference games have different scoring or defensive propensities owing to differential probabilities of penalty calls. That's easy to control for.
As to controlling for ENG, extra attacker and PP goals in general though, I am currently limited by my dataset, which only gives the score of every game and whether or not it went to overtime (as well as home or neutral sites). Without parsing every box score, I can't adjust for PP or ENG. But thanks again.
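To illustrate the point that penalties just fold into the background rate, here is a minimal sketch. It assumes even-strength and power-play goals in a game are independent Poisson counts, and the per-game means (2.1 and 0.7) are hypothetical numbers chosen for illustration, not values from the paper. Under that assumption, the total goal count has the same distribution as a single Poisson with the pooled mean, so a model that only sees final scores loses nothing as long as a team's power-play exposure is consistent from game to game.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def goals_pmf_two_phases(k, even_mean, pp_mean):
    """P(k total goals) when even-strength and power-play goals are
    independent Poisson counts (convolution of the two distributions)."""
    return sum(
        poisson_pmf(j, even_mean) * poisson_pmf(k - j, pp_mean)
        for j in range(k + 1)
    )


# Hypothetical per-game means, for illustration only.
even_mean, pp_mean = 2.1, 0.7

for k in range(6):
    split = goals_pmf_two_phases(k, even_mean, pp_mean)
    pooled = poisson_pmf(k, even_mean + pp_mean)
    print(f"{k} goals: two-phase={split:.6f}  pooled={pooled:.6f}")
```

The two columns agree, which is the sense in which a consistent penalty pattern is absorbed by the team's overall rate. If power-play exposure differs systematically by opponent or conference, the pooled rate would no longer be constant across games, which is the interconference effect mentioned above.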
Re: A new ranking system for college hockey
It sounds fine for trying to predict records, scores, etc. But I wouldn't want to use it to rank teams, say, for tournament selection.
It shouldn't matter how you win a game, just that you won it.
Re: A new ranking system for college hockey
I discuss that briefly in the paper, and the NCAA agrees with you. That doesn't enhance my confidence that you're right. I'm trying to find the most powerful teams, irrespective of their record. The fact that these rankings pretty much mirror both KRACH and the Pairwise suggests that this concern is mostly theoretical. But I grant that this system will rank teams that just barely lost a bunch of games much higher than KRACH will. It will also penalize teams that lose badly more than KRACH does. But consider two teams playing roughly the same schedule: don't you think the team that won all its wins by 3 and lost all its losses by 1 is just plain more worthy than a team with the same record that barely won its wins and got killed in its losses?
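A rough sketch of why margins carry information under a Poisson view of scoring: if each team's goals in a game are treated as independent Poisson counts, the two expected goal totals determine the chance of winning. The function name and the scoring rates below are hypothetical, and this is not the paper's procedure; ties after regulation are simply reported as their own outcome rather than resolved through overtime.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def outcome_probs(mean_a, mean_b, max_goals=25):
    """Return (P(A wins), P(tied after regulation), P(B wins)), assuming the
    two scores are independent Poisson counts. Scores above max_goals are
    ignored; for hockey-sized means the omitted probability is negligible."""
    pmf_a = [poisson_pmf(k, mean_a) for k in range(max_goals + 1)]
    pmf_b = [poisson_pmf(k, mean_b) for k in range(max_goals + 1)]
    win = tie = loss = 0.0
    for goals_a, p_a in enumerate(pmf_a):
        for goals_b, p_b in enumerate(pmf_b):
            joint = p_a * p_b
            if goals_a > goals_b:
                win += joint
            elif goals_a == goals_b:
                tie += joint
            else:
                loss += joint
    return win, tie, loss


# Hypothetical matchups: a team that outscores opponents by a full goal per
# game on average versus one that is only slightly ahead.
for label, (mean_for, mean_against) in {
    "wins big, loses close": (3.4, 2.4),
    "wins close, loses big": (2.9, 2.7),
}.items():
    win, tie, loss = outcome_probs(mean_for, mean_against)
    print(f"{label}: win={win:.3f} tie={tie:.3f} loss={loss:.3f}")
```

Two teams can reach the same record over 34 games with quite different underlying rates. A rate-based model rates the first kind of team higher because its expected win probability going forward is higher.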
Re: A new ranking system for college hockey
Originally posted by BigRedTerrier:
I think this is awesome and I'll definitely take a longer look at this once I'm finished with my thesis. Maybe this will finally help me understand how to fit my own data with a Poisson distribution.
One thing I noticed though... I assume this is a first pass to which you'll then add additional parameters. Would the assumption of scoring uniformity be violated for power plays as well? Is there some way you could reasonably factor in the likelihood of a team to take penalties, or, probably more importantly, to get scored on or to score on power plays? I would think, given that every game is going to have at least 5-6 penalties (or 15 in HE), this would be an important factor to include when trying to model a team's performance. This would probably also apply to ENGs, or goals with an extra attacker, but you could probably just subtract those goals for now.
Very cool though.
Re: A new ranking system for college hockey
Originally posted by Fighting Sioux 23:
As far as penalties and power-play goals and the like are concerned, you could probably use PK and PP percentages somehow. That way, you don't have to go through every single box score.
Also along those lines, goblue... have you been able to statistically show that your model fits the season results? I may have missed that... I apologize if I did.
Re: A new ranking system for college hockey
Certainly a very interesting creation and a good read. A couple of things. First, if this were to be used as the ranking system for determining playoff teams, how do you account for injuries? Say player A is the leading goal scorer for Team A, but gets injured in the final week of the season and is out for the year. The team's rating is based on results from when he played, but by the rules laid out in this system, the goal scoring probability very likely will change for the tournament games, which are what this whole system is ultimately meant to predict. Another injury example: what if a goaltender goes down for the year and the backup has to play out the rest of the season? Obviously, the goalie is the single player on the ice who has the most direct impact on the number of goals scored, and could theoretically be out there by himself and keep a team scoreless, thereby rendering the other five skaters pointless from a defensive perspective. Of course this is a wildly irrational scenario, but it underscores the huge effect the goalie has on the outcome of the game. If he goes down, the system has to account for that in some way. I think the point I'm getting at here is that a system like this seems like it would be dependent upon who is playing in the game, not just generalized statistics about overall offense and defense: who is scoring the goals, and who isn't allowing them. I'm not a math whiz, so I may have missed any adjustment for this you may have made, but if it was there, it didn't seem too heavily weighted. If this kind of system were to be used for determining tournament teams, I think there would have to be some accounting for the changes in lines, injuries, goalie tandems, and other personnel changes made game to game. This may be a reason why the projected records this formula put out were off, which may be a sign that this is incomplete.
Of course the chances of the NCAA adopting something like this are about the same as my chances were earlier today of winning $20,000 on my $2 scratch ticket, but it's still fun to talk about, and I think if you were able to tweak this a bit and produce predictions for previous seasons that were very close to dead-on, this could be something that would be fun to use in the future.
Re: A new ranking system for college hockey
Originally posted by JF_Gophers:
It sounds fine for trying to predict records, scores, etc. But I wouldn't want to use it to rank teams, say, for tournament selection.
It shouldn't matter how you win a game, just that you won it.
I'm not sure I agree with your 2nd sentence.
When Bill James was still writing Baseball Abstract, he did an entire chapter one year on the significance of how you won, and, in particular, by what margin, that postulated that it was very indicative of what kind of team you are/have. Good teams win by large margins. Bad ones don't. That was the gist of it all. This same article completely pooh-poohed 1 run wins as pretty much meaningless (despite how much you always hear about them in MLB), over the course of baseball history, and he took a look at all of it to formulate that opinion.
It looks like the writer here came to some of the same conclusions that James did, for college hockey, although I readily admit we are talking two entirely different sports here. The parallels to what he said in Baseball Abstract are interesting, though.
Re: A new ranking system for college hockey
Originally posted by Red Cows:
I'm not sure I agree with your 2nd sentence.
When Bill James was still writing Baseball Abstract, he did an entire chapter one year on the significance of how you won, and, in particular, by what margin, that postulated that it was very indicative of what kind of team you are/have. Good teams win by large margins. Bad ones don't. That was the gist of it all. This same article completely pooh-poohed 1 run wins as pretty much meaningless (despite how much you always hear about them in MLB), over the course of baseball history, and he took a look at all of it to formulate that opinion.
It looks like the writer here came to some of the same conclusions that James did, for college hockey, although I readily admit we are talking two entirely different sports here. The parallels to what he said in Baseball Abstract are interesting, though.
Quantifying quality of opponent is the hard part of the equation. Over the history of a program, I would agree that margin of victory probably does bear out the quality of a team vs. others. But 34 games and a fairly insular scheduling system don't allow for that type of analysis to be meaningful in a single given season.
If everyone played everyone else in a single season, then maybe I would lend credence to it. But that 1) is impossible and 2) would never happen even if it were possible.
ETA: I would also be interested to know if this ranking system was reverse engineered. Because there is a built-in bias when you create a system that validates teams you think are good, or have been historically good. This is the same question I've had about the RPI number.
You should never start with "Here are the teams I know are good, now how do I validate that?"
Last edited by JF_Gophers; 03-23-2011, 07:37 AM.
Re: A new ranking system for college hockey
Thanks to all. I'm going to add more to the paper about nonuniformity and controlling for PP and ENG, so I won't add much here. In short, I don't really think it's necessary, but it's worth a look. And thanks for the suggestion of using aggregate PP and PK, FS23. That won't really work for technical reasons, but it helped clarify in my head how to describe the issue.
Slurpees: Any statistics-based model assumes that the model applies for the whole dataset, or explicitly invokes some changing parameter over time. There is no reasonable way in this sort of model to account for injuries, personnel changes, or any of that stuff. It essentially just assumes that this year's history is who you are. Of course, so does every other system we're discussing: KRACH and PWR, for example, though human polls can take account of anything they choose. Obviously, a team that sustains injuries to critical players won't be as good as the power rating under this sort of system indicates. And there's no real way in this sort of model to figure out how much worse. To do that, you'd need a model that worked at the player level, not the team level. There are some diagnostics you could use to see if a team is underperforming relative to the way they performed a month ago, but I don't think that's an ideal use for this kind of model, because humans will spot patterns that are really just random occurrences.
JF_Gophers: First, I can state that the methodology has not been tweaked or adjusted in any way to get a result. And the methodology is so simple (in concept -- to get practical numbers you need a pretty good computer and expensive software) there's really not any scope for doing so. Second, this methodology, without knowing the score of a single game (it only knows what one team scored and who their opponent is) managed to rank the teams in such a way that 13 of the 15 non-playin teams in the tournament were in the top 15. So PWR depends on quality wins and wins and head-to-head comparisons, but it gets almost the same results as a method which doesn't know the complete score of any game. For those who haven't looked at the paper, here are the top 15 teams in rank order:
1. North Dakota
2. Miami
3. Yale
4. Boston College
5. Michigan
6. Nebraska-Omaha
7. Union
8. Notre Dame
9. Denver
10. Wisconsin
11. New Hampshire
12. Minnesota Duluth
13. Merrimack
14. Western Michigan
15. St. Cloud State
For a method that doesn't look at the winner of a single game, that's a pretty good list, IMO, other than Wisconsin, for whom I'm going to add a section in the paper. Other than Air Force, the two teams (CC and RPI) who made it in in lieu of SCS and Wisconsin, are ranked 17th and 20th respectively. Pretty close. And note that this methodology got all four of the top seeds, albeit in a different order.
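For anyone who wants to see the "one team's goals plus the opponent's name" idea in code, here is a minimal sketch, not the paper's actual specification. It assumes each record is (scoring team, opponent, goals) and that goals are Poisson with mean offense[scorer] * defense[opponent]. It leaves out home-ice and overtime terms, and it uses simple alternating updates rather than a statistics package. The function name and the three-team schedule are made up for illustration.

```python
from collections import defaultdict


def fit_poisson_ratings(results, iterations=200):
    """Fit offense/defense multipliers by alternating maximum-likelihood updates.

    results: list of (scoring_team, opponent, goals) records, two per game.
    Assumed model: goals ~ Poisson(offense[scoring_team] * defense[opponent]).
    A higher offense is better; a lower defense is better.
    """
    teams = sorted({team for scorer, opp, _ in results for team in (scorer, opp)})
    offense = {team: 1.0 for team in teams}
    defense = {team: 1.0 for team in teams}

    for _ in range(iterations):
        # Offense update: goals scored divided by the defensive strength faced.
        scored, faced = defaultdict(float), defaultdict(float)
        for scorer, opp, goals in results:
            scored[scorer] += goals
            faced[scorer] += defense[opp]
        for team in teams:
            if faced[team] > 0:
                offense[team] = scored[team] / faced[team]

        # Defense update: goals allowed divided by the offensive strength faced.
        allowed, attacked = defaultdict(float), defaultdict(float)
        for scorer, opp, goals in results:
            allowed[opp] += goals
            attacked[opp] += offense[scorer]
        for team in teams:
            if attacked[team] > 0:
                defense[team] = allowed[team] / attacked[team]

    # The overall scale is arbitrary, so pin the average defense at 1.
    # This leaves every offense * defense product unchanged.
    scale = sum(defense.values()) / len(teams)
    if scale > 0:
        offense = {team: value * scale for team, value in offense.items()}
        defense = {team: value / scale for team, value in defense.items()}
    return offense, defense


# Hypothetical results: each game contributes one record per team.
results = [
    ("A", "B", 4), ("B", "A", 2),
    ("B", "C", 3), ("C", "B", 1),
    ("A", "C", 5), ("C", "A", 1),
    ("C", "A", 2), ("A", "C", 2),
    ("B", "A", 2), ("A", "B", 3),
]

offense, defense = fit_poisson_ratings(results)
for team in sorted(offense, key=lambda t: offense[t] / defense[t], reverse=True):
    print(team, round(offense[team], 3), round(defense[team], 3))
```

Sorting by offense divided by defense is only one possible way to collapse the two numbers into a ranking and is an assumption of this sketch. Note that the fit never looks at who won: it sees one side's goal total at a time, which is the property described above.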
Finally, Red Cows: I'm in complete agreement, and the Bill James article you cited was an important article to me way back in 1982 when it was published. For those who don't go back that far, that was the year an Atlanta Braves team that wasn't very good the year before rattled off a dozen wins or so to start the season. As that article pointed out (and this is fully in the spirit of this methodology), any team can go on a hot streak. To make an informed judgment about whether they're any good, you need to look at whether they're killing people or squeaking by. That Braves team, to many people's surprise, ended up winning the National League West that year, and the James article was an attempt to retrospectively figure out whether or not we should have figured that out at the time. Thanks for reminding me of it.
Last edited by goblue78; 03-23-2011, 08:01 AM.
Re: A new ranking system for college hockey
goblue78, I still need to read the details of your paper as I have only had a chance to skim it but I like the idea. Before you start tweaking anything, I would STRONGLY suggest looking at multiple seasons and see how well it matches those.
Re: A new ranking system for college hockey
Originally posted by Khryx:
goblue78, I still need to read the details of your paper as I have only had a chance to skim it but I like the idea. Before you start tweaking anything, I would STRONGLY suggest looking at multiple seasons and see how well it matches those.
Re: A new ranking system for college hockey
You would get similar results just looking at the top 16 teams in goals scored average (11/16 teams), and goals against average (11/16).
Top power plays? 9/16. Top penalty kill? Only 6/16.
But I think this proves that score doesn't really matter. Because 15 of the 16 teams that won 20+ games this year are in the tournament. WMU at 19 is the only outlier that's in, and Wisconsin at 21 wins is the only one out.
So scoring a lot of goals and not giving up many leads to more wins, which leads to making the tournament and being ranked highly.
So why bother with scoring, when wins actually get you a closer match?
Last edited by JF_Gophers; 03-23-2011, 08:45 AM.