RankOAR's First Season on Row2k
As with all data science projects, interesting predictions don't mean anything unless they are both accurate and add something beyond what human intuition predicts. We hoped RankOAR would come close to the coaches' rankings, as that would be a sign that it is useful. If it could predict the conference championships better than the coaches, it would be a tremendous success.
With all the D1 conference championships held the weekend of May 13-15, there were plenty of data points for comparing RankOAR to the best experts in the field: specifically, the seedings produced by the coaches of each D1 conference. The coaches have the most up-to-date knowledge of each boat from each team, so their seedings provide a "gold standard" ranking of each event. These seedings should be much more accurate than national rankings, which compare teams that don't commonly race each other and come from coaches who don't talk to each other several times a season. And, immediately after the coaches produce those rankings, the boats race exactly those events, allowing us to compare predicted vs. actual results. With 349 boats ranked both by RankOAR and by their own coaches, we have enough data points to produce meaningful comparisons.
A little more than a week ago, we moved to version 1.2 of our ranking system. We tweaked the RankOAR algorithm to lessen the importance of older races and to decrease the expected variation in performance between races, which better reflects what we were seeing on the water. We made our last rankings prior to the NCAA selections with this version. Running a comparison now lets us compare both the 1.0 and 1.2 versions of RankOAR, using only data available prior to the Conference Championships, against the coaches' predictions.
To punish larger errors more heavily, we closely mirrored the statistical concept of variance: we took the square of the difference between expected and actual finish in each race, then took the mean across all races. This means that if the top seeded boat finished second, this produces a variance of 1, while if the top seeded boat finished 9th, this produces a variance of 64. Lower scores indicate that the rankings more closely reflected the eventual race results. No priority was given to accuracy in higher seeds over accuracy in lower seeds, though there is ample empirical evidence across many sports that the middle seeds are the most difficult to predict accurately. Having spent several evenings comparing various predictions, I am very pleased with how accurate RankOAR is at predicting college rowing races.
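For readers who want to reproduce the scoring, here is a minimal sketch of the "variance" measure described above, written in Python. The function name and the sample seedings are made up for illustration; this is not the actual RankOAR code, just the mean of squared seed-versus-finish differences as the text describes it.

```python
def mean_squared_seed_error(seeds, finishes):
    """Mean of squared (finish - seed) differences; lower is better.

    seeds and finishes are parallel lists: seeds[i] is the predicted
    place of boat i, finishes[i] is where it actually placed.
    """
    if len(seeds) != len(finishes):
        raise ValueError("seeds and finishes must have the same length")
    if not seeds:
        raise ValueError("need at least one boat")
    squared_errors = [(finish - seed) ** 2 for seed, finish in zip(seeds, finishes)]
    return sum(squared_errors) / len(squared_errors)


# Top seed finishes second: (2 - 1)^2 = 1
print(mean_squared_seed_error([1], [2]))  # 1.0

# Top seed finishes ninth: (9 - 1)^2 = 64
print(mean_squared_seed_error([1], [9]))  # 64.0

# Hypothetical four-boat final in which seeds 2 and 3 swap places
print(mean_squared_seed_error([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.5
```

Because the differences are squared, one boat missing its seed by eight places costs as much as 64 boats each missing by one, which is the "punish larger errors" behavior we wanted.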
While a single conference is not a large enough sample to say anything meaningful, it is fun to identify who did well at seeding and who did not - though accurate predictions probably coincide with less thrilling racing. The AAC excelled in accurately seeding their races, with a mean variance of 0.320. The MAAC came in second with 0.828, followed by the Big 12 with 0.914. All three of those conferences beat RankOAR 1.2. Overall, the coaches had a mean variance of 1.473 across all 349 comparable boats. RankOAR 1.0 had a mean variance of 1.461, which is a statistical tie. (Yay!!! Our first take on the algorithm equaled the best subject matter experts' take on the rankings.) RankOAR 1.2 did even better, with a mean variance of 1.269, beating the coaches' rankings by 14%.
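The 14% figure is the relative reduction in mean variance compared with the coaches. A quick sketch of that arithmetic in Python, using the overall numbers quoted above (the helper name is just for illustration):

```python
def relative_improvement(baseline, candidate):
    """Fraction by which candidate's score is lower than baseline's."""
    return (baseline - candidate) / baseline


coaches = 1.473      # coaches' seedings, all 349 boats
rankoar_1_0 = 1.461  # RankOAR 1.0
rankoar_1_2 = 1.269  # RankOAR 1.2

# Version 1.0: under 1% better than the coaches, effectively a tie
print(f"{relative_improvement(coaches, rankoar_1_0):.1%}")

# Version 1.2: roughly 14% better than the coaches
print(f"{relative_improvement(coaches, rankoar_1_2):.1%}")
```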
For more details, Figure 1 breaks down the comparison of accuracy by different conferences and events.
This weekend will show whether we were better at ranking the NCAA Championship than the collective wisdom of the NCAA seeding committee. (Hello Vegas!)
Looking ahead to next year, we would love to add rankings for the men's teams. However, we will need some help with automated data import and data cleaning to do that. If you are interested in helping or if you have any suggestions or see obvious errors in RankOAR, please reach out to: rank(remove_this)oar@carmody(dot)ws. And, for those who were wondering about the name, suffice it to write that we are Star Wars nerds at heart.