Re: Boston College Women's Hockey 2018-2019: NO COMMERCIALS, NO MERCY
What you guys are talking about starts to look like what is sometimes called the 'Pythagorean Theorem of Baseball' or the 'Pythagorean expectation', developed by SABR pioneer Bill James.
Not really. Pythagorean expectation is good for determining a team's expected winning percentage, and is a better predictor of future winning percentage than actual winning percentage is. However, ARM is asking a slightly different question. Inherent in pythagorean analysis is an assumption that it doesn't matter whether you generate your goals/goals allowed ratio by scoring more, or by giving up fewer. ARM is either suggesting or implying one of two things: that there comes a point at which you can't score enough goals to compensate for giving up more; or that the pythagorean formula breaks down at the extremes.
Either one of these could be true. It is definitely the case that there are diminishing returns to becoming better at something you are already good at. If you have an excellent offense and a mediocre defense, it is likely that you will have an easier time reducing the number of goals you allow than increasing the number that you are scoring. So, there is value in looking at the question of just how many goals a team can give up before you conclude that they are unlikely to be able to win a national championship, no matter how good their offense is.
Aside from my comment about needing to adjust for context, I'd also say that looking only at the actual national champions leaves you with too small a sample to generate meaningful results. I'd probably extend it to looking at all Frozen Four participants, both to give a larger sample and because once they make it that far, it's possible for any team to have a fluky weekend and win it all.
Basically, it isn't necessarily how many runs a team scores, nor is it the number of runs they give up, but over time, it is the differential of runs scored vs runs given up that will predictively determine how many games they win.
To be mathematically precise, it is the ratio of runs to runs allowed that's important, not the differential. To be even more precise, it is that ratio raised to an exponent. The original pythagorean analysis that Bill James created used an exponent of 2, which is why he used the term "pythagorean" to describe it. Later work by Clay Davenport showed that the proper exponent was ~1.8, and that it fluctuated over time. In general, a higher scoring era resulted in a larger exponent, though that wasn't always the case.
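For anyone who wants to play with it, here is a minimal sketch of the formula as described above: the ratio of goals scored to goals allowed, raised to an exponent, and turned into an expected winning percentage. The function name and the two teams are made up for illustration, and the 1.8 is just the figure mentioned above plugged in, not a claim about what the right exponent is for hockey.

```python
def pythagorean_expectation(goals_for, goals_against, exponent=2.0):
    """Expected winning percentage from scoring totals.

    Equivalent to GF^x / (GF^x + GA^x), written in terms of the
    ratio GF/GA to make it obvious that only the ratio matters.
    """
    if goals_for <= 0 or goals_against <= 0:
        raise ValueError("goal totals must be positive")
    ratio = goals_for / goals_against
    powered = ratio ** exponent
    return powered / (powered + 1.0)


# Two hypothetical teams with the same ratio (1.5) but very different
# scoring levels. The formula treats them identically, which is the
# assumption being questioned in this thread.
high_scoring = pythagorean_expectation(150, 100, exponent=1.8)
low_scoring = pythagorean_expectation(75, 50, exponent=1.8)
print(high_scoring == low_scoring)
```

Note that a team that scores exactly as many as it allows comes out at .500 no matter what exponent you pick; the exponent only controls how quickly the expected winning percentage moves away from .500 as the ratio moves away from 1.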
As the Wikipedia article mentions, it has been applied (with varying measures of success) to other sports, including hockey; though I assume - without looking - that the application to hockey was to the NHL, where 'over time' involves more games, and would become a better predictor.
It's actually been very successful across sports, once you adjust the exponent. For basketball, it's much greater than 2; for soccer, a lot less. The values for the NHL are more stable and predictive not just because they play more games, though that's a part of it, but also because the gap in talent between the best teams and the worst is smaller than it is at the college or most minor league levels. As I suggested up top, it wouldn't be surprising if the pythagorean formula starts to break down at the extremes, and trying to use the same formula to predict both Clarkson and RIT could be difficult. A few years ago, I noodled around with the numbers a bit and what I saw was that the exponent for Division 1 women's hockey is probably in the 1.7-1.8 range, but I wouldn't put any level of confidence in that given the very cursory nature of my investigation.
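If anyone wants to try estimating the exponent themselves, one common way (not necessarily what I did back then) is to note that the formula implies log(W / (1 - W)) = x * log(GF / GA), so the exponent x is the slope of a line through the origin. The sketch below assumes you have each team's goals for, goals against, and a winning percentage with ties already folded in however you prefer; fit_exponent is a hypothetical helper name, and the sample seasons are synthetic numbers generated from a known exponent purely to check the arithmetic, not real results.

```python
import math


def fit_exponent(seasons):
    """Least-squares slope through the origin of log-odds vs log-ratio.

    seasons: iterable of (goals_for, goals_against, win_pct) tuples,
    with win_pct strictly between 0 and 1.
    """
    num = 0.0
    den = 0.0
    for goals_for, goals_against, win_pct in seasons:
        # Undefeated or winless teams have undefined log-odds, so they
        # are skipped here. That is a real limitation at the extremes.
        if not 0.0 < win_pct < 1.0 or goals_for <= 0 or goals_against <= 0:
            continue
        x = math.log(goals_for / goals_against)
        y = math.log(win_pct / (1.0 - win_pct))
        num += x * y
        den += x * x
    if den == 0.0:
        raise ValueError("need at least one team with GF != GA")
    return num / den


# Synthetic check: build winning percentages from a known exponent and
# confirm the fit recovers it. These are NOT real team totals.
KNOWN_EXPONENT = 1.8
totals = [(150, 60), (120, 80), (95, 100), (70, 130)]
synthetic = []
for gf, ga in totals:
    powered = (gf / ga) ** KNOWN_EXPONENT
    synthetic.append((gf, ga, powered / (powered + 1.0)))

recovered = fit_exponent(synthetic)
print(round(recovered, 3))
```

With real data the points won't sit on the line, and teams far from .500 pull hardest on the slope, so a fit like this is exactly where you'd expect to see trouble if the formula breaks down at the extremes.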