Follow me on Twitter!

Monday, July 15, 2013

So Who CAN Win the 2013 CrossFit Games - Methodology

In some ways, it seems like making predictions about the CrossFit Games should be relatively easy. After all, we have plenty of data to make direct comparisons between athletes. So far this season, these athletes all have completed the same 13 events. By this point, it seems like the cream should have risen to the top. Remember, the 2007 Games had only three events and the 2008 Games had only 4 events. With 13 events already, shouldn't the champion be fairly clear?

Of course, what we've seen is that competition has gotten much tighter in recent years. In 2008, there were only a handful of athletes of the caliber to even think about contending for the title. This year, if we compare the Games athletes, 14 different male athletes and 14 different female athletes have finished in the top 3 of at least one workout. So nearly a third of the field has shown the capability to be the close to the best in the world on a given workout.

So, obviously, the complicating factor with predicting the Games is that we don't know what the workouts will be. And even if we knew what they were (in fact, we likely will know some of the events within the next week or so), we can pretty much guarantee that they won't match any of the 13 workouts we've seen thus far. So what can we do?

Last year, I estimated the odds of each athlete winning the Games by randomly selecting 10 events from among the Regional and Open events that had occurred. As I looked back on that methodology, I noticed that it really only gave a small number of athletes a chance at winning or even placing in the top 3. The reason is that I implicitly assumed that each event of the Games would exactly mirror one of the prior events of the season. After some investigation, it turned out that most of the events from the Games did not match any one event from the Regionals or Open particularly closely.
  • Of the 20 events from the 2012 Games prior to cuts (10 men's events + 10 women's events), I looked at the correlation between that event and each Regional and Open event.
  • For each of those Games events, I took the maximum of those correlations.
  • 3 of 20 were at least 60% correlated with one Regional or Open event.
  • 10 of 20 were at least 50% correlated with one Regional or Open event.
  • 5 of 20 were not more than 30% correlated with any Regional or Open event.
Certainly, the reason for this variation is due largely to the design of the workouts in the Games vs. the Regionals and the Open. But I also think part of it is due to the fact that the Games are simply a different competition than the Regionals and the Open. Athletes come in at varying levels of health, with varying levels of nerves, and so even if the events were identical to regionals, I think we'd have different results.

Either way, I felt that in estimating the chances for each athlete this year, I needed to account for how much variation we have seen from the Regionals/Open to the Games. I needed to simulate the Games using results that weren't identical to the Regionals/Open but were correlated. I also wanted to rely primarily on the Regional results, since we know that some top athletes tend to coast through the Open while others take it a bit more seriously. Still, I did include the Open results to a lesser extent, because I don't think it's fair to ignore it entirely as it provides insight into how athletes fare in events that are generally lighter than what we see at the Regional level.

Additionally, we know that historically, the Games has typically included at least one extremely long event (Pendleton 2, for instance). This event is generally very loosely correlated with anything at Regionals or in the Open. But, we can assume that athletes who did well on the "long" event the prior year will likely do well on the long event this year this year.

So I set up a simulation of 15 events, assuming no cuts (all athletes compete in all 15 events). Here is a description of how each event was simulated:
  • For 12 events, I randomly chose one of the Regional events to be the "base" event.
  • I started with the results (not the placement, the actual score) from that base event, then "shook up" those results enough so we'd get about new rankings that were roughly 50% correlated to the base event.
    • To "shake up" the original results, I adjusted each athlete's original result randomly up or down. Exactly how much I allowed the result to vary depended on how much variation was involved in that event to begin with. So if Regional Event 4 was the base event, I might let the scores vary by 3 minutes, but if Regional Event 1 was the base event, they might vary by only 1 minute.
    • I did testing in advance to see how much I needed to vary each individual's score to achieve about 50% correlation. It turned out to be about +/- 2.5 standard deviations. So each athlete's score could move from his/her original score by as much as 3 standard deviations in each direction.
    • The athletes scoring well in the base event still have an advantage, but we allow things to shift around a bit.
  • For 2 events, I used the same process, but I randomly chose one of the Open events to be the "base" event.
  • For 1 event, I used the Pendleton 2 results from 2012 as the "base" event. For athletes who didn't compete in the Games last year, they were assigned a totally random result.
    • Athletes who did well last year have an advantage, but I did "shake up" the results a bit in each simulation.
    • Keep in mind that finishing poorly in Pendleton 2 last year was considered worse than not competing at all.
    • I made two exceptions: Josh Bridges and Sam Briggs missed last year due to an injury but did extremely well on the long beach event in 2011. I treated them as if they had competed in Pendleton 2 and finished very highly.
  • These events were simulated 5,000 times. The Games Scoring table was applied to determine the final rankings after each simulation.
Before applying this method to this year's field, I went back to see what type of estimates I would have gotten last year with this method. Some notes from those simulations:
  • I looked at how good a job I did at predicting which athletes would finish in the top 10. The mean square error (MSE) of my model would have been 0.121 for women and 0.104 for men. Had I simply assumed the top 10 from Regionals would be top 10 at the Games with 100% probability, the MSE would have been 0.130 for men and 0.133 for women. If I had instead assumed all athletes had an equal shot at finishing in the top 10, the MSE would have been 0.254 for men and 0.259 for women. So I did have an improvement over those naive estimates.
  • On the men's side, I would have given Rich Froning a 45% chance of winning, with Dan Bailey having the next-best chance at 30%. For the women, I would have given Julie Foucher a 53% chance of winning and Annie Thorisdottir a 22% chance of winning (remember, Foucher was the pick for many in the community last year, including me). No one else would have had more than a 7% chance on the women's side.
  • For podium spots, I would have given Froning an 86% chance, Chan a 4% chance and Kasperbauer a 2% chance. For women, I would have given Thorisdottir a 61% chance, Foucher an 84% chance and Fortunato a 3% chance. While it would be nice to have given Chan, Kasperbauer and Fortunato a better shot, I don't recall many people talking these athletes up prior to the Games. None had ever reached the podium before, although Chan had been close.
My goal was to strike a balance between confidence in the favorites (like Froning) and allowing enough variation so that relative unknowns (like Fortunato) still have a shot. This largely comes down to how much I shook up those original results. The less I shook up the original results, the more confident I would have been that Froning would have won last year. But I also would have given someone like Matt Chan virtually no shot, because his Regional performance simply wasn't that strong compared to the other heavy hitters. But if I shook up the original results too much, things just got muddy and I allowed everyone to have a fairly even chance to win, which doesn't seem realistic either.

No model is going to be perfect with this many unknowns. Sure, you could argue that I am not taking into account other factors, like the advantage that Games "veterans" could have. But I would counter by pointing out that last year, Fortunato was a first-time competitor and Kasperbauer hadn't competed individually since 2009, and they both fared well. Other athletes like Neil Maddox simply didn't perform well at the Games despite experience at the Games and great performances at Regionals. A lot of it simply has to do with what comes out of the hopper, how each athlete manages the pressure and what little breaks go for or against each athlete throughout the course of the weekend. But at the end of the day, the fact is that the athletes who do well at Regionals and the Open generally fare well at the Games, and that's why I am using those results as the basis for my estimates.

With the methodology and assumptions out of the way, move ahead to my next post for the picks for the 2013 Games!

No comments:

Post a Comment