Thursday, July 10, 2014

Does Past Games Experience Matter? (And Other Thoughts On Predicting the Games)

Today, I'd like to tackle a topic that's been mentioned quite often in CrossFit Games commentary. It's basically an assumption that's been taken as fact: having experience competing at the CrossFit Games in the past gives an athlete an advantage over first-time Games competitors. I've generally believed this to be true, but without data to support it, we're all really just guessing.

But let's start with the reason I decided to look into the issue. About a week ago, I released my 2014 Games predictions, which were used in the CFG Analysis Pick 'Em that is going on right now. What I started to notice as picks came in is that people tended to wager much more often on past Games competitors. One reason for this is simply familiarity: people know these athletes and have seen them perform well (or perhaps they just like cheering for them). But I believe another reason is that people tend to assume that Games experience matters. And the reason that is a factor here is that my model does not take past Games experience into account.

Why? Well, in constructing these models, I wanted to be able to predict the chances that an athlete would win (or finish top 3 or top 10), not simply make a prediction about where they would finish on average. That meant I couldn't just set up some sort of linear regression model that could account for several variables (such as Regional placing, Open placing, past Games placing, etc.). I needed a model that could generate a range of outcomes, and I felt using this season's results was my best bet. This is a different approach from the one I took for the Regionals predictions, for three reasons:
  1. Entering the Regionals, there had only been 5 events thus far this season, which is not really enough for me to get a sense of what types of events might come at the Regional level.
  2. Some athletes notoriously coast through the Open, so those results alone would not be a great predictor of Regionals success.
  3. Because so many more athletes compete at Regionals each year compared to the Games (~1400 vs. 85), there was enough historical data for me to build a decision-tree-style model for Regionals. There is simply not enough past Games data to learn about what characteristics give an athlete a chance of winning. Basically what we would learn is: "In order to win, be Rich Froning."

So my solution was to build a pretty unique simulation model that took into account specific results for each athlete for all 11 events this season prior to the Games. It's at this point that I'd like to recite a quotation that one of my work colleagues (a predictive modeling guru if there ever was one) likes to bring up quite often:

"Essentially, all models are wrong, but some are useful." - George Box

Any model we come up with to predict the CrossFit Games will be wrong. Remember, a perfect model would predict with 100% certainty exactly who would finish in what positions. No one would have a 20% chance of winning - they would have a 100% chance or a 0% chance. But creating such a model is impossible. So with that in mind, I acknowledge that my model is wrong. But is it useful? I think so.
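
To give a feel for how a simulation can produce a chance of winning rather than a single predicted placing, here is a minimal sketch. It is not the actual model behind these predictions, whose internals aren't described in this post. It assumes a hypothetical data format (points per athlete per earlier event, higher is better) and simply resamples each athlete's own past results to build many imaginary competitions. The athlete names and scores are made up.

```python
import random
from collections import Counter


def simulate_top_k_chances(event_points, k=10, n_sims=2000, n_events=10, seed=0):
    """Estimate each athlete's chance of finishing in the top k.

    event_points: dict mapping athlete name -> list of points earned in
    earlier events this season (hypothetical format, higher is better).
    """
    rng = random.Random(seed)
    top_k_counts = Counter()
    for _ in range(n_sims):
        # One simulated competition: for every event, draw each athlete's
        # score at random from that athlete's own earlier results.
        totals = {
            name: sum(rng.choice(points) for _ in range(n_events))
            for name, points in event_points.items()
        }
        # Highest total wins; ties fall back to the input order.
        ranked = sorted(totals, key=totals.get, reverse=True)
        for name in ranked[:k]:
            top_k_counts[name] += 1
    # Share of simulations in which each athlete landed in the top k.
    return {name: top_k_counts[name] / n_sims for name in event_points}


# Made-up season results for three imaginary athletes.
season = {
    "A": [100, 95, 90, 100],
    "B": [90, 100, 85, 80],
    "C": [60, 70, 95, 65],
}
chances = simulate_top_k_chances(season, k=1, n_sims=5000)
print(chances)
```

With k=1 the output is each athlete's estimated chance of winning, and the chances add up to 1 because every simulated competition has exactly one winner.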

The chart below shows the calibration of this model on the 2012 and 2013 Games (combined men and women for both years). This shows how often athletes finished in the top 10, compared with the chances I gave them. A perfectly calibrated model (not necessarily a perfectly accurate one) would have the blue line follow the red line exactly, so that for the athletes I predicted with a 7% chance to finish top 10, exactly 7% of them did finish in the top 10.


As we can see, the model has been pretty well calibrated the past two years. Generally speaking, athletes with a low probability of finishing in the top 10 don't finish in the top 10. The model is also much more accurate than a dull (but perfectly calibrated) model that gives all athletes an equal chance of finishing top 10: my model's mean square error was 11.6% vs. 17.1% for the equal chance model.
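
For readers who want to run this kind of check themselves, here is a rough sketch of both calculations: a calibration table (predicted chance vs. observed frequency within bins) and the mean squared error against an equal-chance baseline. The forecasts and outcomes below are invented for illustration and are not the 2012-2013 data. One handy fact: a model that gives everyone the same chance p, equal to the share of athletes who actually finish top 10, has a mean squared error of exactly p × (1 − p).

```python
def mean_squared_error(predicted, outcomes):
    """Average squared gap between forecast chance and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(predicted)


def calibration_table(predicted, outcomes, n_bins=5):
    """Group forecasts into equal-width probability bins.

    Returns (mean predicted chance, observed frequency, count) for each
    non-empty bin, ordered from the lowest bin to the highest.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    rows = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            rows.append((mean_p, observed, len(members)))
    return rows


# Invented forecasts (chance of a top-10 finish) and outcomes (1 = top 10).
predicted = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.05, 0.05, 0.05, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

model_mse = mean_squared_error(predicted, outcomes)

# Equal-chance baseline: every athlete gets the overall top-10 rate.
base_rate = sum(outcomes) / len(outcomes)
baseline_mse = mean_squared_error([base_rate] * len(outcomes), outcomes)

print(calibration_table(predicted, outcomes))
print(model_mse, baseline_mse)
```

A lower mean squared error than the baseline means the forecasts are doing more than restating the base rate, which is the comparison made above.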

But of course, my model is not perfect. And one area where it could be skewed is in how it accounts for (or rather, does not account for) past experience. If past experience is an advantage, then my model is understating (to some degree) the chances for returning Games athletes and overstating (to some degree) the chances for first-timers.

Which brings us back to the original question: Does past Games experience matter? To answer the question, I compiled the results from the Games and Regionals from 2011-2013 and tagged all athletes with Games experience prior to the year of competition (I went all the way back to 2007 to see if athletes had past experience). 

The simplistic way of looking at this is to compare the finishes of athletes with prior experience against those of first-timers. Looking at things this way, we find that returning athletes finish approximately 9.5 spots higher than new athletes on average (18.3 vs. 27.8). However, this could simply be due to the fact that the returning athletes are just flat-out better, and their experience had nothing to do with their Games performance.

What we should do to account for this is compare Games performances to Regionals performances in the same year (using the cross-Regional rankings, adjusted for week of competition). In general, we expect athletes who fare better at the Regionals to perform better at the Games. So if Games experience is a factor, the returning competitors should perform better at the Games than their Regionals results would indicate. When we look at things this way, we see that returning competitors do indeed improve their placing by approximately 0.6 spots from Regionals, while new competitors drop by approximately 0.8 spots from Regionals.

Unfortunately, there is still a problem with this comparison. Although Regionals performances are a good indicator of Games performance, there is still a general tendency for athletes to regress towards the mean. That is, athletes who finish near the top at Regionals don't tend to improve their placement at the Games, while athletes near the bottom at Regionals tend to improve slightly on average. Part of this is due to the fact that if you finish near the top at Regionals, there is basically nowhere to go but down (and the reverse is true for the athletes at the bottom of the Regional standings).
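
Regression towards the mean shows up even when nothing but luck separates two competitions. The toy simulation below is purely illustrative: it assumes each imaginary athlete has a fixed ability, and each of two stages adds independent random noise to it. Nobody has an experience advantage, yet the top finishers from the first stage slip on average in the second, and the bottom finishers climb.

```python
import random


def rank(scores):
    """Return ranks for a list of scores, where rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


def simulate_rank_changes(n_athletes=40, noise=1.0, n_sims=500, seed=0):
    """Average change in rank (stage two minus stage one) for the top and
    bottom quarter of stage-one finishers. Positive means a worse placing."""
    rng = random.Random(seed)
    quarter = n_athletes // 4
    top_change = 0.0
    bottom_change = 0.0
    for _ in range(n_sims):
        # Fixed ability per athlete, plus fresh noise at each stage.
        ability = [rng.gauss(0, 1) for _ in range(n_athletes)]
        stage_one = rank([a + rng.gauss(0, noise) for a in ability])
        stage_two = rank([a + rng.gauss(0, noise) for a in ability])
        by_stage_one = sorted(range(n_athletes), key=lambda i: stage_one[i])
        top = by_stage_one[:quarter]
        bottom = by_stage_one[-quarter:]
        top_change += sum(stage_two[i] - stage_one[i] for i in top) / quarter
        bottom_change += sum(stage_two[i] - stage_one[i] for i in bottom) / quarter
    return top_change / n_sims, bottom_change / n_sims


top_change, bottom_change = simulate_rank_changes()
print(top_change, bottom_change)
```

The first number comes out positive (top finishers place worse the second time) and the second negative, which is why a raw Regionals-to-Games comparison can mislead when one group starts out ranked higher than the other.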

So to be fair, we need to compare returning athletes with first-timers who had similar Regional placements. Since we don't have a huge sample, I split the rankings into buckets of 10. Within each bucket, I found the average Regionals and Games placements of returning athletes and first-timers, as well as the average improvement or decline. The results are presented in the chart below.
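
The bucketing described above can be sketched in a few lines. The input format here, a list of (Regionals rank, Games rank, returning flag) tuples, is an assumption for illustration, and the sample rows are invented rather than taken from the 2011-2013 results.

```python
from collections import defaultdict
from statistics import mean


def compare_by_bucket(athletes, bucket_size=10):
    """Compare returning athletes with first-timers within Regionals-rank buckets.

    athletes: list of (regional_rank, games_rank, is_returning) tuples.
    Returns one row per bucket that contains both groups:
    (first rank in bucket, avg improvement of returning athletes,
     avg improvement of first-timers, returning-athlete edge).
    """
    changes = defaultdict(lambda: {True: [], False: []})
    for regional_rank, games_rank, is_returning in athletes:
        # Ranks 1-10 map to bucket 1, ranks 11-20 to bucket 11, and so on.
        bucket_start = (regional_rank - 1) // bucket_size * bucket_size + 1
        # Positive value = placed better at the Games than at Regionals.
        changes[bucket_start][is_returning].append(regional_rank - games_rank)
    rows = []
    for bucket_start in sorted(changes):
        groups = changes[bucket_start]
        if groups[True] and groups[False]:  # skip buckets missing a group
            returning_avg = mean(groups[True])
            first_timer_avg = mean(groups[False])
            rows.append(
                (bucket_start, returning_avg, first_timer_avg,
                 returning_avg - first_timer_avg)
            )
    return rows


# Invented sample: (Regionals rank, Games rank, has past Games experience).
sample = [
    (1, 2, True), (5, 9, False), (8, 6, True), (10, 14, False),
    (12, 10, True), (15, 20, False), (25, 30, True),
]
rows = compare_by_bucket(sample)
print(rows)
```

In this made-up sample the 21-30 bucket is dropped because it has no first-timers to compare against, which is the same small-sample problem that makes buckets of 10 a compromise.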


For every level of competitors except those near the bottom, the athletes who had past Games experience showed an advantage at the Games over first-timers with similar Regional placements. While we can see that there is significant variation in how much this advantage is worth, if I had to put a number on it, I'd say that Games experience is worth between 4 and 5 spots at the Games. Remember, my current predictions assume all athletes have equal experience, so a reasonable adjustment might be to improve the average rank of past competitors by ~2 spots and drop the rank of new competitors by ~2 spots.
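
As a back-of-the-envelope illustration of that adjustment (not something built into the current predictions), splitting the gap evenly means shifting each group by about 2 spots in opposite directions, which opens up roughly a 4-spot difference between two otherwise identical athletes. The function name and the floor at rank 1 are assumptions made for this sketch.

```python
def adjust_average_rank(average_rank, has_games_experience, shift=2.0):
    """Shift a predicted average rank for Games experience.

    Returning athletes move up by `shift` spots (a lower rank number),
    first-timers move down by the same amount. Nobody goes above rank 1.
    """
    if has_games_experience:
        adjusted = average_rank - shift
    else:
        adjusted = average_rank + shift
    return max(adjusted, 1.0)


# Two hypothetical athletes the unadjusted model rates identically.
veteran = adjust_average_rank(10.0, has_games_experience=True)
rookie = adjust_average_rank(10.0, has_games_experience=False)
print(veteran, rookie, rookie - veteran)
```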

This analysis is, of course, not precise. It is likely that experience matters more for veterans like Rich Froning and Jason Khalipa than it does for someone who has only competed once at the Games before. Moreover, some veteran Games athletes have consistently struggled to match their Regionals performances, while some newcomers have overachieved in the past (see Garret Fisher last year).

I don't plan to adjust my predictions this year, for a few reasons:
  • There is no simple way to build this factor into the model framework I have set up;
  • I feel that the predictions are still pretty reasonable on the whole (based on the calibration seen in the past two years);
  • For the Pick 'Em contest, I committed not to change those predictions for reasons of fairness. I suppose I could produce a second set of predictions, but I think that's just creating unnecessary confusion. Anyone entering the contest is welcome to use the information in this post to their advantage if they wish.

Still, when it comes time to make my predictions again next year, I'm going to try to find a way to account for past Games experience. The model still won't be perfect, but hopefully it will be even more useful.
