Patman
Rodent of Unusual Size
Lowell hasn't played BU yet.
Alright, I thought we had played a game
I can simulate everything except across the conference tournament break. So, for now I simulate up to the end of the regular season, then once the tournaments are seeded I can simulate to the end of the conference tournaments.
I think once we get past the Monday tournament, Sioux sports does sims from the remaining regular season games.
If I had my way I'd have a giant simulator available earlier in the season a la baseball prospectus... But schedules are fluid because of tournaments and tie-breakers are hard for this "as I need to" programmer.
Thanks for your analysis, "just win baby!" helps in all cases as well.
Yale's heavily insular schedule does make them a very interesting case.
Add in the fact that 8 of Yale's 13 remaining regular season games are against current TUCs (and three more against Princeton and Rensselaer, who are each within 0.015 RPI of being a TUC) and it's practically impossible to make any definitive claims about them.
The best advice for Yale going forward is to continue winning the big games. The losses to RPI and Holy Cross may be maddening as a fan, but they are great for the team, as explained a couple of seasons ago by Scott Brown. The main point, though, is that only 5 of Yale's 16 games so far have been played against TUCs and 8 of their remaining 13 will be. So, really, we just don't know enough about Yale's ability to make any statements on how strong or weak they are compared to their current Pairwise placement.
Thanks for your analysis, "just win baby!" helps in all cases as well.
Not always.
I was just trying to look at their schedule going ahead and how heavily back-loaded it is.
Burgie12: You're right about Yale and its TUC games, but you left out the games that have already been played that might become TUC games again if Holy Cross, UMass and Colorado College creep back in... they're even closer than RPI and Princeton.
Even if Holy Cross does slip back into the TUCs, it's just one loss. In order for Yale to stay in the field, they'll have to keep their RPI high. To do that, they're going to need a lot of wins down the stretch against good-quality teams, which will take care of their TUC Record whether the Crusaders are above the cliff or not.
That's what I think makes the PWR interesting, if maddening to analyze: the fact that Yale's chances might well depend on which of Colorado College and Holy Cross has a better season from here on out. One is very good for the Elis and one is very bad. BUT... you don't want any team to sink TOO far below the TUC line or it begins eating into your RPI, whether you beat them or lost to them. In an ideal world, every team you lost to should have an RPI of .4999 to maximize your chances of getting in.
Within reason. There are always exceptions.
Thanks for your analysis, "just win baby!" helps in all cases as well.
The one team that really jumped out to me is BU. Their 4-6-0 TUC Record will significantly hurt them down the stretch (especially the fact that they're 3-6-0 against teams in the Top 10 of the RPI, teams that are extremely unlikely to fall off of the cliff), as will their 1-2-0 record against the WCHA and their 0 games played against CCHA teams this season. Their poor performance against the WCHA renders the COp comparison lost against nearly every WCHA team right off the bat and no games against CCHA teams means that they have one less opportunity to hide their poor TUC Record against most CCHA teams.
At BU's current RPI, once the TUC Record comes into play, they will lose the Dartmouth and North Dakota comparisons. Their comparison against Wisconsin is also hanging on by just a thread (both teams have a 0.400 TUC win percentage). Further, once their TUC Record comes into play against Colgate, potentially Cornell, and Denver, BU's RPI is the only thing keeping them winning those comparisons.
The Terriers could easily drop 4 comparison wins, just by playing out the rest of the regular season, pushing them much closer to the cut line than they are currently.
I could be wrong, but I think there is an error in the current standings on USCHO. BU has a 4-6 record vs TUC on the main PWR rankings page, but if you look at the individual comparisons it is showing a BU record of 4-5 vs. TUC and thus is not awarding a PWR point for TUC in all BU comparisons.
Right, H2H games are not also included in their TUC record. So, when looking at the DU / BU comparison, they will only have a 4-5-0 record. And, in the BC / BU comparison, it is 3-4-0, but against Miami (for example) it does come into play.
I wonder if that 4-5 record is in a comparison with a TUC they have played. In that case, that game does not count on the TUC record.
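To make the head-to-head exclusion concrete, here is a minimal Python sketch. The function name `tuc_record`, the `Game` tuple, and the schedule are all made up for illustration (only the 4-6-0, 4-5-0 and 3-4-0 totals come from the posts above), and this is not how USCHO actually computes it.

```python
from collections import namedtuple

# Hypothetical game record: result is "W", "L", or "T" from the team's side.
Game = namedtuple("Game", ["opponent", "result"])

def tuc_record(games, tucs, compared_with=None):
    """Return (wins, losses, ties) against TUCs, skipping games against
    the team being compared with (head-to-head is its own criterion)."""
    wins = losses = ties = 0
    for game in games:
        if game.opponent not in tucs or game.opponent == compared_with:
            continue
        if game.result == "W":
            wins += 1
        elif game.result == "L":
            losses += 1
        else:
            ties += 1
    return wins, losses, ties

# Made-up schedule chosen only so the totals match the numbers in the thread.
tucs = {"Denver", "Boston College", "Miami", "TUC A", "TUC B"}
bu_games = [
    Game("Denver", "L"),
    Game("Boston College", "W"), Game("Boston College", "L"),
    Game("Boston College", "L"),
    Game("TUC A", "W"), Game("TUC A", "W"), Game("TUC A", "L"),
    Game("TUC B", "W"), Game("TUC B", "L"), Game("TUC B", "L"),
    Game("Non-TUC", "W"),
]

overall = tuc_record(bu_games, tucs)
vs_denver = tuc_record(bu_games, tucs, compared_with="Denver")
vs_bc = tuc_record(bu_games, tucs, compared_with="Boston College")
vs_miami = tuc_record(bu_games, tucs, compared_with="Miami")
```

Against a TUC that BU has not played (Miami here), nothing is dropped, so the full record applies in that comparison.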
I can simulate everything except across the conference tournament break. So, for now I simulate up to the end of the regular season, then once the tournaments are seeded I can simulate to the end of the conference tournaments.
If there's interest in any particular output/results, or even some new type of interactivity so you can explore the data yourself, let me know and I'll see what I can do. I'll probably get my first post of the season up this week.
Basically, there are three kinds of variably scheduled games in college hockey:
* Best of three series
* Tournaments where games are played by winners/losers of other games
* Conference tournaments. After the first round, they all fall into the 2nd bullet above. In the play-in rounds they depend on standings and may involve ranking and reordering the bracket. Each is different, which is a little annoying; but the real killer is the CCHA which uses shootout results that I don't have in my games database.
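As a rough illustration of the first bullet, a best-of-three series can be simulated by playing games until one side has two wins, so the third game only exists when it is needed. This is just a sketch: the function name and the single fixed win probability are my assumptions, and playoff games are assumed to always produce a winner.

```python
import random

def simulate_best_of_three(p_home_win, rng):
    """Simulate one best-of-three series.

    p_home_win: assumed probability that the host wins any single game.
    Returns (host_wins, visitor_wins); the series stops at two wins.
    """
    host = visitor = 0
    while host < 2 and visitor < 2:
        if rng.random() < p_home_win:
            host += 1
        else:
            visitor += 1
    return host, visitor

rng = random.Random(2013)  # fixed seed so the run is repeatable
series = [simulate_best_of_three(0.6, rng) for _ in range(1000)]
# Share of simulated series that needed a third game.
went_to_three = sum(1 for h, v in series if h + v == 3) / len(series)
```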
Jim, let's talk after I get something written up... Figuring out the playoffs is mostly a game of efficiently programming tie-breaking rules.
I am not a computer scientist nor a true programmer (and I really get sick of needing to be a jack of all trades.)
I think if the work can be parceled in some way then something useful can be done.
A Monte Carlo simulation of the remainder of the season (with PWR calculations at the end of each simulated season) is fairly simple to do, but there are, it seems to me, two big problems (and one small) with it. First, KRACH doesn't predict ties, or at least I don't see any simple way to get it to do so, and ties are a nontrivial component of PWR. In addition, the KRACH predictions should at least be modified for home ice, although that's fairly simple to do as long as you have a good way to make the estimate. Second, we would be fixing KRACH at today's level which really introduces a whole new set of uncertainties. It makes no sense (theoretically or in terms of computer effort) to dynamically update KRACH for pseudodata, but you will find yourself in the position of making predictions that you'd never make in real life, where teams with great records (in the pseudodata) are getting thumped by lesser teams because of lucky runs. This problem is in some ways philosophical rather than practical, but it's a big one.
One other fairly sizeable problem is programming the playoff rules in every conference, which is really a pain.
Jim and Pat,
Thanks for all you fellows bring to discussions like this. Can I ask a few questions? I seem to remember from some discussion last year that it is impossible to do a full-up "odds" considering every possibility, because the number of games is too many, and the PWR would have to be calculated at the end of each combination of games.
So, if that's right, it seems to me that Jim does what is called a series of "Monte Carlo" runs. Now, if I understand that right, that means he uses some kind of comparison between each team that gives odds of winning each game (KRACH would work nice here), and uses some random number generator or other programming tool to 'pick' the winner of each game, in accord with those odds. Do that for every game, add up the PWR at the end. Then repeat. Do it a bunch of times (1000? I think), and you can say the "odds" of UND getting a #1 seed are (insert number here in %). Does that seem right?
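Here is a minimal sketch of that loop in Python, under some loud assumptions: the ratings and schedule are invented, ties and home ice are ignored, and "finishes with the most simulated wins" stands in for the real PWR calculation, which is far more involved. The function names are mine, not anything from Jim's code.

```python
import random
from collections import Counter

def win_prob(q1, q2):
    """Plain Bradley-Terry (KRACH-style) chance that team 1 beats team 2."""
    return q1 / (q1 + q2)

def simulate_season(ratings, remaining_games, rng):
    """Pick a winner for every remaining game according to the odds."""
    wins = Counter()
    for a, b in remaining_games:
        if rng.random() < win_prob(ratings[a], ratings[b]):
            wins[a] += 1
        else:
            wins[b] += 1
    return wins

def estimate_odds(ratings, remaining_games, team, n_runs=1000, seed=0):
    """Fraction of simulated seasons in which `team` has the most wins
    (or shares the most). A stand-in for 'add up the PWR at the end'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_runs):
        wins = simulate_season(ratings, remaining_games, rng)
        if wins[team] == max(wins.values()):
            hits += 1
    return hits / n_runs

# Invented three-team example, each pair meeting twice.
ratings = {"A": 300.0, "B": 200.0, "C": 100.0}
games = [("A", "B"), ("A", "C"), ("B", "C")] * 2
odds_a = estimate_odds(ratings, games, "A")
odds_c = estimate_odds(ratings, games, "C")
```

Swapping the "most wins" check for a full PWR calculation is where the real work would be.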
If so, that is a nice tool to have available to those of us curious about this matter. Because, those odds will fill the rest of the games. And, in the end, we will see some things changing from the present PWR that we might not guess at ahead of time.
For example, I suspect that Lowell will come out lower than they are right now, for reasons discussed above. BU perhaps, too. So, that "Monte Carlo" odds calculator will help us learn who is in a more advantaged situation, and who in a less advantaged situation, that we can see just from the PWR numbers now (I mean the # of comparisons won).
Thanks again fellows. My natural interest in these things is why I named myself "Numbers."
There are adaptations of Bradley-Terry (which is known by college hockey fans as KRACH) that account for ties or home-ice advantage. I haven't seen them used in conjunction, but it shouldn't be overly difficult to account for. There was a paper published by three Taiwanese professors that details the additional factors used for either adaptation (PDF).
For the home-ice advantage, the probability is shown by:

\[
P(T_1 \text{ beats } T_2) =
\begin{cases}
\dfrac{\Theta q_1}{\Theta q_1 + q_2} & \text{if } T_1 \text{ is home} \\[2ex]
\dfrac{q_1}{q_1 + \Theta q_2} & \text{if } T_2 \text{ is home}
\end{cases}
\]

where \(q_1\) = B-T Ranking of T1, \(q_2\) = B-T Ranking of T2, and \(\Theta > 0\) = strength of home-field advantage.

The log of the probability of each game is added and the KRACH and theta values are adjusted to maximize the sum (this can be done by Excel Solver or by someone with coding experience in a different language, such as R).
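For anyone who would rather see it in code than in Solver, a rough Python sketch of the home-ice formula and the sum of logs might look like this. The function names, the game tuple layout, and the two-team example are my own assumptions; the actual maximization over the ratings and theta is left to whatever optimizer you like.

```python
import math

def p_t1_beats_t2(q1, q2, theta, t1_is_home):
    """Home-ice Bradley-Terry: theta multiplies the home team's rating."""
    if t1_is_home:
        return theta * q1 / (theta * q1 + q2)
    return q1 / (q1 + theta * q2)

def log_likelihood(games, ratings, theta):
    """Sum of log-probabilities of the observed results.

    games: iterable of (winner, loser, winner_was_home) tuples (assumed layout).
    A fitting routine would adjust `ratings` and `theta` to maximize this.
    """
    total = 0.0
    for winner, loser, winner_was_home in games:
        p = p_t1_beats_t2(ratings[winner], ratings[loser], theta, winner_was_home)
        total += math.log(p)
    return total

# Invented example: two teams, three results.
ratings = {"X": 200.0, "Y": 100.0}
games = [("X", "Y", True), ("Y", "X", True), ("X", "Y", False)]
ll_no_home_edge = log_likelihood(games, ratings, theta=1.0)
ll_some_home_edge = log_likelihood(games, ratings, theta=1.2)
```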
The problem I have with this method is it seems to apply the same home-ice advantage to each team, which is obviously false. I thought I had seen another paper discussing ways to introduce home-field advantage to Bradley-Terry, but I can't find it right now.
To include the possibility of ties in Bradley-Terry:

\[
P(T_1 \text{ beats } T_2) = \frac{q_1}{q_1 + \Theta q_2}
\]

\[
P(T_2 \text{ beats } T_1) = \frac{q_2}{\Theta q_1 + q_2}
\]

\[
P(T_1 \text{ ties } T_2) = \frac{(\Theta^2 - 1)\, q_1 q_2}{(q_1 + \Theta q_2)(\Theta q_1 + q_2)}
\]

where \(q_1\) = B-T Ranking of T1, \(q_2\) = B-T Ranking of T2, and \(\Theta > 1\) = "threshold" parameter within which teams can be considered equal, aka a tie occurs.

And, the same thing is done here: we take the log of the probability of the result of each game and adjust the KRACH and theta values to maximize their sum.
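A small Python sketch of the three tie-adjusted probabilities, in case it helps anyone plug in their own numbers. The function name is mine; the example uses the Massachusetts (103.18) and Alaska Anchorage (52.53) ratings and the Θ = 1.43 from this post.

```python
def win_loss_tie(q1, q2, theta):
    """Tie-adjusted Bradley-Terry: return (P(T1 wins), P(T2 wins), P(tie)).

    theta > 1 is the threshold parameter; theta = 1 gives no ties at all.
    """
    p_win = q1 / (q1 + theta * q2)
    p_loss = q2 / (theta * q1 + q2)
    p_tie = (theta ** 2 - 1) * q1 * q2 / ((q1 + theta * q2) * (theta * q1 + q2))
    return p_win, p_loss, p_tie

# UMass vs UAA with the tie-adjusted ratings and theta quoted in this post.
umass_win, uaa_win, tie = win_loss_tie(103.18, 52.53, 1.43)
```

The three values always add up to 1, and with these inputs they come out close to the W/L/T split quoted for UMass in this post.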
I have been calculating teams' ratings based on the tie-adjusted Bradley-Terry system and here are their rankings, for comparison (Θ = 1.43):

Code:
Boston College          600.23
New Hampshire           586.81
Quinnipiac              461.19
Minnesota               459.13
Notre Dame              417.67
Boston University       387.60
Denver                  322.94
North Dakota            294.09
Dartmouth               256.29
Yale                    252.97
Miami                   219.21
Nebraska-Omaha          216.85
UMass Lowell            209.39
Western Michigan        208.80
Minnesota State         195.70
St. Cloud State         179.53
Cornell                 153.99
Colgate                 150.95
Wisconsin               144.62
Northern Michigan       141.57
Providence              133.96
Colorado College        131.42
Lake Superior           128.97
Robert Morris           122.56
Minnesota Duluth        120.11
Ohio State              120.06
Union                   119.94
Niagara                 117.32
Ferris State            108.67
Alaska                  106.87
Massachusetts           103.18
Rensselaer               89.64
Merrimack                87.57
Michigan Tech            87.34
Princeton                83.70
Vermont                  81.27
Bemidji State            79.79
Holy Cross               77.24
Michigan State           76.62
Harvard                  76.16
Bowling Green            72.58
St. Lawrence             67.70
Brown                    63.15
Michigan                 57.40
Northeastern             56.93
Mercyhurst               53.39
Alaska Anchorage         52.53
Maine                    49.83
Connecticut              44.90
Air Force                41.93
Clarkson                 39.21
Canisius                 38.93
Bentley                  33.75
RIT                      24.93
Army                     24.91
Penn State               22.51
American International   12.25
Alabama-Huntsville        7.03
Sacred Heart              2.66

Using these adjusted ratings, a team has a 17.6% chance of tying itself, 15.8% against a team with a KRACH two times its own, 13.1% against a team with three times its own, etc. For example, a UMass v UAA game would no longer have an (approximately) 66.7 / 33.3% win split. It would now have a 57.9 / 26.3 / 15.8% W/L/T split for UMass.
Surprisingly, the estimated tie rate is actually pretty accurate. There have been 83 ties so far this season and using the theta of 1.43, if you add up the probability of a tie in every game that has been played so far, the total is 83.30.
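That check is just a sum of the tie probability over every game played. A sketch of it in Python follows; the function names and the tiny game list are made up for illustration, and the real check would run over the full results database, which I am not reproducing here.

```python
def tie_prob(q1, q2, theta):
    """Tie probability under the tie-adjusted Bradley-Terry model."""
    return (theta ** 2 - 1) * q1 * q2 / ((q1 + theta * q2) * (theta * q1 + q2))

def expected_ties(games, ratings, theta):
    """Expected number of ties: add up the tie probability of each game.

    games: iterable of (team_a, team_b) pairs, one per game played.
    """
    return sum(tie_prob(ratings[a], ratings[b], theta) for a, b in games)

# Invented mini-schedule using three ratings from the list in this post.
ratings = {"Yale": 252.97, "Cornell": 153.99, "Colgate": 150.95}
games = [("Yale", "Cornell"), ("Yale", "Colgate"), ("Cornell", "Colgate")]
total = expected_ties(games, ratings, theta=1.43)
```

Comparing `total` with the actual number of ties in the same set of games is the sanity check described above.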
Even if you are updating KRACH after every weekend or so with the pseudodata, it's still going to underpredict the upset runs that teams will go on while riding a hot goalie.
Yes it is. Especially since the AHA doesn't really publish their info, HEA's guidelines could be interpreted in three different ways, and the CCHA uses shootouts.
I know that was way more math intensive than most people were looking for, but hopefully it was useful to some.