cMax likes being right, but loves being proved wrong. - Chris Maxwell
Each Monday morning of the spring racing season, starting sometime in February, a spreadsheet arrives in the inboxes of Chris Maxwell--the namesake of cMax--and row2k. Sent by Andrew Kirk, the data therein is a just-the-numbers distillation of every collegiate Varsity 8 result in the country that weekend, from the fastest crews through to tiny programs boating a sole crew; it's all there.
Throughout the weekend, Kirk, who runs a quantitative hedge fund, has scraped the row2k results pages for V8 results, applying his understanding of math and data in equal parts to his understanding of rowing, with which he has been closely involved since the 1980s. Kirk rowed at Columbia, still sculls out of Cambridge Boat Club, and the sport continues to be a major part of the lives of his family members, like daughter Emma who won the Youth Single at the Charles last year.
Along with the data, Kirk outlines any questions he has on the provenance of certain crews--is that Penn's 2V, is that Clemson's 1V or is it their club program, who is racing in that 'Open 8' in San Diego, which the heck Washington boat is which--and row2k shares it around our inboxes to try to confirm Kirk's suspicions one way or the other.
Once there is a relative certainty about the data, Maxwell dumps it all into his model, which relies on 'a well-known statistical technique called OLS (ordinary least squares),' as outlined in the Methodology section posted below every weekly poll.
Maxwell, who teaches Game Theory and Econometrics as well as several popular courses on Sports Economics at Boston College, and whose wife and children are rowers, has tweaked the model to account for race conditions, how recent a race is, how close a race is and whether the teams would have been 'pressed,' whether it was a heat or a final, and more.
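The article names the technique (OLS on race margins) but not the implementation, which remains Maxwell's own. As a minimal sketch of how a seconds-back rating of this general type can work--with invented crew names and margins, and none of Maxwell's adjustments for conditions, recency, or 'pressed' crews--each head-to-head margin can be treated as a noisy observation of the difference between two crews' speed ratings, and the ratings fit by ordinary least squares:

```python
import numpy as np

# Hypothetical race results: (crew_a, crew_b, seconds crew_a finished ahead).
# A seconds-back model treats each pairwise margin as an observation of the
# difference between two crews' underlying speed ratings.
results = [
    ("Alpha", "Bravo", 3.0),    # Alpha beat Bravo by 3.0s
    ("Bravo", "Charlie", 2.0),
    ("Alpha", "Charlie", 4.5),
    ("Charlie", "Delta", 6.0),
]

crews = sorted({c for a, b, _ in results for c in (a, b)})
idx = {c: i for i, c in enumerate(crews)}

# Design matrix: one row per race, +1 for the first crew, -1 for the second,
# so X @ ratings approximates the observed margins.
X = np.zeros((len(results), len(crews)))
y = np.zeros(len(results))
for r, (a, b, margin) in enumerate(results):
    X[r, idx[a]], X[r, idx[b]] = 1.0, -1.0
    y[r] = margin

# Ratings are only identified up to an additive constant, so pin their mean
# to zero with an extra constraint row, then solve by ordinary least squares.
X = np.vstack([X, np.ones(len(crews))])
y = np.append(y, 0.0)
ratings, *_ = np.linalg.lstsq(X, y, rcond=None)

for crew in sorted(crews, key=lambda c: -ratings[idx[c]]):
    print(f"{crew}: {ratings[idx[crew]]:+.2f}s vs. average")
```

Because the margins here are not perfectly consistent (Alpha over Charlie directly is 4.5s, but 5.0s via Bravo), least squares splits the difference--which is exactly the kind of reconciliation a ranking over a hundred crews needs.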
Before he runs the model, Maxwell has no idea what the output will be. There is a 'butterfly wing' effect in play, where racing by 'unranked' crews could even affect the crews at the top of the ranking, and when Maxwell sends the final poll to row2k each week, his comments regularly reflect different ways he was surprised by the output.
Then the data is posted to the site, and the debate begins. The cMax team sits tight for a couple days until racing starts up again, and the conversion of college kids racing their hearts out into data points resumes.
cMax Origins: A Data-Centric Rowing Dad
Maxwell's sports modeling started with his daughter's third-grade girls soccer league. His wife was in charge of scheduling, so he devised a model that controlled for strength of competition to help her put together a balanced schedule.
When his oldest daughter started rowing in high school, he had a new sport to start figuring out.
"Her team was pretty good, so I thought, well, maybe I can try to use my model to see how good," Maxwell recalls. "My goal at the time was to link up the New England high school teams with the Mid Atlantic high school teams, and that was a total failure."
The problem in that first model was that he could find only one school at the time that competed against schools from both regions, so there was not enough connectivity in the data to make the model work.
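The connectivity problem has a crisp graph interpretation: two crews can only be compared if a chain of shared races links them. A hedged sketch (the schedule and school names below are invented for illustration) that checks whether a result set forms one connected component:

```python
from collections import defaultdict, deque

# Hypothetical schedule: each pair raced head-to-head at least once.
races = [
    ("Exeter", "Andover"),      # New England schools race each other...
    ("Andover", "Deerfield"),
    ("StAlbans", "Gonzaga"),    # ...Mid-Atlantic schools race each other...
    ("Gonzaga", "NCS"),
    # ...but with no race bridging the two regions, the graph splits in two.
]

graph = defaultdict(set)
for a, b in races:
    graph[a].add(b)
    graph[b].add(a)

def components(graph):
    """Find connected components of the race graph by breadth-first search."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:
            team = queue.popleft()
            if team in comp:
                continue
            comp.add(team)
            queue.extend(graph[team] - comp)
        seen |= comp
        comps.append(comp)
    return comps

comps = components(graph)
print(f"{len(comps)} component(s):", comps)
```

With two components, no model--OLS or otherwise--can place one region's crews relative to the other's; a single school racing opponents in both regions merges the components, but one bridge still leaves the cross-region comparison resting on a single noisy observation.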
When the family's high school rower went off to college in the early 2000s, Maxwell started paying attention to collegiate rowing and found a pretty robust data source: the row2k results page.
Irreverent 'Consultants'
Maxwell first shared his rankings with an email group that included a few rowing Olympians he had worked with as an economist: Tom Murray, Malcolm Baker, and Mike Peterson. Since Maxwell felt like he needed to know more about the sport to model it correctly, he asked his rowing 'consultants' endless questions about how rowing worked and the ways coaches think about results and boatings.
Those email exchanges became increasingly irreverent and filled with inside jokes as the season went on, not unlike the Monday morning exchanges that Maxwell, Kirk, and row2k still have each week, although it is hard not to believe those early discussions were a bit more spicy.
Eventually, Murray suggested Maxwell reach out to row2k about sharing the poll more widely, so Maxwell sent row2k an email in 2003--on April 1st of all days--to see if there was any interest. Part of his pitch? That he could provide a ranking that would cover many more teams than any of the coaches polls of the time.
While cMax now is a weekly update, nearly in real-time, of how fast over 100 men's and nearly 200 women's varsity eights are relative to each other--and accurate enough that coaches and athletes take the margins therein seriously enough to trust (and argue over)--that only evolved over time.
"Early on there was a lot of experimentation and the models changed a lot," Maxwell remembers. "There was a ton of work, and a huge difference between where I was when things started and where I ended up with row2k. I pretend the model is simple stuff, but it really isn't.
"It's not just getting the race times right, but also figuring out which team was where, and whether it was the club team or not," says Maxwell.
Eventually, figuring all of that out was what Kirk brought to the effort. "Andrew is just awesome at catching all that," according to Maxwell.
'The First Discussions Were Phenomenal' - Kirk Brings Data Expertise to the Mix
The two met in 2009, not long after Kirk wrote to row2k to share his own statistical model, which he had been doing independently and had even showed to Baker, who was by then teaching at Harvard Business School. row2k asked Maxwell if we might put them in touch, and he replied enthusiastically--in an email that included detective work finding missing WCC results from the previous weekend that were buried in a team press release; Maxwell is always on task.
Not long afterward, row2k drove up to Cambridge for a fateful meeting with the two data masters at the now-closed Starbucks near the Riverside Boat Club; two hours of high-level nerding out, data parsing, and comedy later, a team was born.
"The first discussions I had with Chris were phenomenal," Kirk remembers. "It was nice to meet someone who does take a passion to data. There are not a ton of people who really do that to the meticulous level that Chris has. But then his viewpoint on the sport is very different from mine. He's looking to model it. He's looking to figure out where the data has value in a classic modeling sense.
"I come to it just from a rowing perspective. When I rowed in college, a crew could get open water and seem dominant, but it really was only a handful of seconds that separated teams. The actual difference was always a question, back before you had the information that we have now."
Kirk's data-scraping plays a key role, and what folks see on row2k each week in the cMax rankings is what Maxwell’s model makes of the data set Kirk has passed along--all of which Maxwell still prints out, collects in massive binders, and meticulously double checks.
"The model is so sensitive to data errors," he explains, "but I typically find almost zero errors in Andrew's work, which is amazing."
"I tinkered with my model fairly regularly early on, but I haven't really changed much about the model in the last half dozen years or so," Maxwell says. "Nobody would believe how much time I spend putting all the results into those beautiful excel spreadsheets that I send to row2k. Everybody thinks you push a button, but as anybody who is in this business knows, you don't just push a button. That takes time.
"I estimate different models along the way, and they pretty much all agree by the end of the season. But at the start of the season, everything is very dependent on the data, and the connectedness of the data, which Andrew is an expert on, and how easy is it to compare teams across the country from each other."
"I work in the financial world," says Kirk, "and the data that's now available online is incredible in that you can get any sort of time series data on anything that moves in price.
"I got really good at being able to go through the Internet and find that stuff and then compile it, which in any model is ninety-five percent of the work.
"When I start getting interested in looking at margins in rowing races, I realized that I could do it very quickly and at least be able to compile the data in a nice format so that when I send it off, Chris can look at it and hopefully save him some work in terms of finding all the stuff."
The Real Challenge: Figuring Out Who is Who
Two sets of eyes on what Kirk compiles each week make the data set "amazingly strong," according to Kirk.
"We use all the information that we have at row2k," says Maxwell, "and if it's just two teams racing and someone catches a crab, usually that makes the write-up in the comments on row2k. But in big regattas, we don't get any of that, and that makes it more complicated. On top of that, not only do we have trouble identifying teams, we also run into this phenomenon where some teams race their 2V's in 1V races.
"Overwhelmingly, the real challenge is getting the data and figuring out who's who."
"When we first started doing it together," Kirk remembers, "we were struggling over things like whether it was the College of Charleston or University of Charleston, and they wouldn't really say, and you had to go through the team websites, which was sort of iffy back then.
"But there's certain things that happen every year, like at the Windermere Cup, where those aren't the boats you think they are, or at the Knecht Cup, where those are the lightweight women racing in the open category, that sort of thing."
For cMax to work, according to Maxwell, "row2k is the entire source and has always been the only source for the data.
"Every once in a while, I'll find a team that is racing, but not reporting their results," he says, "and I will go to the team's website and grab some stuff if I find it. But, all the data really comes from row2k and, in fact, I'm not really sure where else we might look other than going through all the team websites."
"It's a central hub for all the data," says Kirk. "It's in one spot and it makes it really easy, to know that you can get any result that's happened in the world and it's on the site.
"The ability just to go through that list of races makes it really easy to compile the data."
"Getting those numbers right is just hugely important," Maxwell adds. "I teach Econometrics, and at the start of the semester, I talk about what's really important for doing good econometrics, and important topic number one is data integrity."
In fact, back in 2004, Maxwell's first year sharing the rankings for publication, the model turned out to be not quite ready for prime time--thanks to a single data input error on a single crew at the San Diego Crew Classic--so cMax did not launch publicly until the following season.
"San Diego played this important role of bringing together East and West Coast teams," Maxwell explains, noting that it still functions that way today. "That year never happened because we couldn't find the data problem."
Context and Methodology
The context and methodology of the model are hugely important as well, which is one of the reasons the poll posts each week with an explanation of the method, and a disclaimer.
"Way back when when we started, it was Ed Hewitt's piece of genius to include a line saying, this is for entertainment purposes, don't take it too seriously everybody, and that was brilliant," Maxwell says.
"I came out with the first week's ratings this year, and the coaches have Texas in first and I don't have Texas in first. And I think, uh oh, they're all going to say that I'm an idiot."
For the record, Maxwell's 2025 models have suggested all along that the Stanford Women's V8 was the fastest crew, a prediction that the April 25th Longhorn Invite results confirmed. As it turns out, the Cardinal V8 is in fact the #1 seed for this week's NCAA Championships.
[ed. note: it is worth pointing out that cMax rates the varsity eights only, whereas the Pocock CRCA coaches poll that had Texas in first for the first six weeks this year is a ranking by team, just like the NCAA DI team championship.]
What makes rowing tricky to model, even with a great data set, is that rowing races are not games played under uniform conditions, making the sport's data points very unlike what Maxwell used in his first soccer models or in any of the ones he has built over the years for the other sports.
"All the races are held under very different conditions, with very different lineups, with very different strategies."
It is nothing like, according to Maxwell, "rating European football where they play with basically standardized rules and standardized fields, or American baseball or basketball. It's just a very different challenge."
"I have models of European football and my rule is, by the end of February, we pretty much have good forecast for the end of the year, but I've never thought that at all about rowing."
In addition, Maxwell found that he had relatively few results to work with, given the nature of the short college racing season where, in reality, very few of the hundred-plus teams he ranks race each other head-to-head. "We're forecasting with very little data. If anybody knew that we could forecast with three observations, they would laugh. That's just insane."
'Seconds Back' Model vs. a 'Wins' Model
Perhaps, but it works, because cMax is a speed model, looking at the seconds back and relative speeds between crews.
"The speed models tend to be better because you don't need as much data to estimate things," says Maxwell, who also runs what he calls a 'Coaches Model' that is based on wins that "only pays attention to who beat whom and doesn't pay attention to margins." By the end of the season, he says that coaches model starts to "catch up" to the speed model and they look similar.
"The reason I call it the coaches model is because that model, which only pays attentions to wins, actually ends up looking a lot like the coaches poll," Maxwell says. "Human beings are good at processing that kind of information and have a harder time processing information like a three second differential here and two seconds there. That's where computers can be helpful.
"When it started, maybe because I was not steeped in the rowing world, I started by thinking that a speed model made sense, but I was not realizing that was not the way the rowing community thought about race results. The rowing community thought very much in terms of who beat whom and not so much by how much, so this concept of 'seconds back' was, at the time, pretty different.
"And now I don't know that it's so different anymore. The idea is that you could have two teams where one was right behind the other, but they're really close together, and the seconds back will tell you that. I thought that was more useful information and frankly to estimate the model with not a lot of data, you get a lot more information if you know margins of victory as opposed to just whether or not a team won."
'And cMax Says...'
Indeed, while margins have always been something coaches might take into account, the cMax model has made the idea of 'seconds back' a thing in college rowing.
Maxwell remembers watching a Cal racing video a few years back, interested to see how the model would perform, when the announcer actually referenced the cMax prediction.
"Literally just at the start of the race, the announcer says, 'And cMax says...' That was really the first time I ever had a sense that folks talked about it."
Another time, Maxwell bumped into a coach in an elevator at the NCAAs, who told him she talked about cMax with her team all the time.
"She said that what she liked about the ratings was that if her team lost, but they beat the cMax margin, then they should feel good about themselves, and if they won and they didn't beat the cMax margin, then she could get on their case. It never dawned me that it might play that sort of role."
Maxwell has even had a few coaches contact him to ask about using data from the poll in talking with their athletic directors, since cMax generates information about relative speed even for teams that are not ranked in coaches polls that only go twenty or twenty-five teams deep.
cMax Watches the Racing As Closely as We All Do
Doing cMax for nearly twenty-five years has turned Maxwell into an avid watcher of the sport. In fact, since Maxwell and Kirk track both men and women across divisions and the club/varsity divide, you could argue the two of them see more rowing results each week than anyone else, coaches included. Even at row2k, no single person is looking at each result that gets posted, let alone parsing them.
"It's a forecasting model, says Maxwell, "and so every week you look at what actually happened.
"When the Texas women raced Michigan, and when Brown raced the Rutgers women, I was interested. If you're relatively close or at least if you predict the winner, that's good, but if you're not, then you wonder a bit about why that happened."
On the men's side this season, cMax has had Cal on top all season, ahead of Washington, which turned out to be where Cal finished when they won the MPSF title two weekends ago--and the model had kept Cal at #1 even after they lost to Washington the first time the two raced in April at The Dual.
"As we saw, Washington beat Cal, and then Washington did get rated closer to Cal by cMax, but the model didn't change," Maxwell says, on why he still had Cal at #1 despite the upset. "I was curious, to be honest. Andrew asked me before I ran the model, what are you going to predict? I've been doing this long enough that I looked at it and I thought, you know, that's what the model says, and I have faith in the result. So I'm just going to go with it and everybody can roll their eyes."
Kirk and Maxwell chatting about their predictions about the predictions is very much the kind of back and forth that goes on week to week.
Complicated, Limited, But Actually Quite Accurate
The model predicted the six finalists at Men's Sprints, too, and got the gold and silver medalists right, in Harvard and Dartmouth. Harvard's actual loss to Brown at the Sarasota 2k in March, which hurt them in some coaches polls initially, never knocked them out of the #3 spot on cMax. The model had them ahead of the other EARC schools by the end of April, which is exactly where Harvard finished in winning Sprints two weeks ago.
When Maxwell and Kirk are crunching the data each week, they are running what Kirk finds through each of their own models: Maxwell applying the secret sauce he's refined over the years, and Kirk dialing in other nuances, like giving more weight to recent results. While cMax is the product of Maxwell's model, they compare the results each comes up with, says Kirk.
"When I look at the CMax ratings and the stuff that I do for my personal consumption, mine are no better," says Kirk. "There's only so much we can squeeze out of the data that can predict what's going on.
"But that said, the standard deviation between the IRA results each year and what I predict the IRAs would be is about maybe a length, four seconds, and Chris's is the same. If you think about a race that's 400 seconds, we basically have a standard deviation of one percent on the eventual outcome.
"Even though it's very complicated and we have limited data, we only have margins and even that is kind of iffy, it's actually quite accurate," says Kirk. "We're within one percent of the actual margin, and it's very rare that a boat that's predicted to be open water ahead against another one will lose to that boat in a head-head race.
"So it's actually reasonably good and I was always surprised by that.
"That's what got me into the modeling. I was tinkering with it because, as a master's rower, I was looking at other people's head race times and seeing how they would do at the Head of the Charles. When they got there, it was almost within a couple seconds of either side, and there's a lot more data in collegiate rowing."
While the data set does get deeper as the season rolls on and there are more results Kirk can gather up and feed to Maxwell, they say there is not really a point in the spring where the model changes radically.
"There isn't a tipping point," says Kirk, "but the rule is the more data, the better the model."
At the same time, while the data collecting starts in late February with the first races on the west coast, the cMax team occasionally waits for a bit at the beginning of the season to make sure the model is stable enough to start publishing.
"We did it the April 16th week this year and I was still curious about things shaking out on the women's side," Kirk says. "It could be really, really wrong. But then the ratings actually proved out well."
There was one school this year, even in the early going, that was a bit of a surprise, says Maxwell, and it turns out the model was right.
"This year, we had Rutgers near the top, and Rutgers is near the top. They're fast.
"You just hope you've got solid ratings by the last race of the season, that's the goal"
"One thing that might interest people," says Maxwell, "is if the model predicts something different at the top for the last race of the season.
"If 'so and so' was in first for the last six weeks, then all of a sudden the model says, 'Nope,' looks like they slipped into second? The model's not always right, but it is right more often than it is wrong.
"I've always thought it was kind of interesting to see what happens, at the top, because those teams are going to the NCAAs and IRAs. If there's a team that sort of jumps ahead or falls behind in the week before those final events, I've always thought that was worth paying attention to."
In 2025, cMax's number ones have stayed stable--the Stanford Women and the Cal Men--but in 2024, Washington jumped to number one late in the season, after Princeton and Cal each took turns on top, and of course Washington did win the IRA, though cMax did not see Harvard at #4 grabbing a silver medal.
"It's always a surprise, at least to me," says Maxwell, about when it happens, "and it's kind of interesting to see how it all plays out.
For the record, cMax sees the 2025 NCAA Grand Final as Stanford, Texas, Tennessee, Princeton, Washington, and yes, Rutgers, with Yale ranked seventh; while the IRA would be Cal, Washington, Harvard, Dartmouth, Princeton, and Stanford, with Brown ranked seventh.
How will it actually play out when the data points revert to college kids racing their hearts out?
We'll see this weekend, and you can be sure that Maxwell and Kirk will be watching, too, because, after all, the model does love being proved wrong.