Monday, July 9, 2012

2012 CrossFit Games - Who Ya Got?

OK, it's time to get down to business and predict some friggin' results, people. For background on how we got here, see the previous post ("The Method to the Madness that is Predicting the CrossFit Games").

First, for the ladies. The model I used was as follows: Games Points = 0.82 * Regional Points + 0.48 * Open Points + 123. Remember, points are calculated using the current Games scoring system (all events using the 100-point scale). For those of you saying, "That's boring - why did you only include those two variables and not, say, age or prior Games experience?" read the previous post. Now, if you're mentally prepared for the results, the table is below.
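If you want to play along at home, here is a rough Python sketch of the women's formula above. The coefficients come straight from the model; the function name and the sample inputs are made up purely for illustration and are not any athlete's real point totals.

```python
def predict_women_games_points(regional_points, open_points):
    """Projected Games points for a woman, using the coefficients quoted in the post.

    Both inputs are assumed to already be on the current Games scoring
    system (100-point scale per event), as the post describes.
    """
    return 0.82 * regional_points + 0.48 * open_points + 123


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    projected = predict_women_games_points(regional_points=500, open_points=400)
    print(f"Projected Games points: {projected:.1f}")
```

With those made-up inputs the arithmetic is 0.82 * 500 + 0.48 * 400 + 123, so the Regional score carries most of the weight.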


That, my friends, is pretty much a straight-up "pick 'em" at the top. Would I question anyone whatsoever for picking Annie (or for that matter, any of the top 6 women)? Absolutely not. BUT I AM NOT PETER KING, AND I STICK BY MY PREDICTIONS: Julie Foucher will win the CrossFit Games. Don't let me down, Julie.

Now, for the men. Things were a bit more complex (and hopefully more accurate for that reason). The model here is: Games Points = 0.56 * Regional Points + 0.41 * Open Points + 162 * Prior Experience - 15.68 * Age Beyond 26 + 174. Prior Games experience is a 1 for "Yes" and 0 for "No." So for the men, you should do well if you are young, have competed at the Games before and did well at Regionals and the Open. Drum roll please...
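Here is the same kind of rough sketch for the men's formula. One loud assumption: I am reading "Age Beyond 26" as the number of years over 26, floored at zero, so anyone 26 or younger gets no age penalty. The function name and sample inputs are hypothetical.

```python
def predict_men_games_points(regional_points, open_points, prior_experience, age):
    """Projected Games points for a man, using the coefficients quoted in the post.

    prior_experience: True if the athlete has competed at the Games before.
    age: age in years. ASSUMPTION: "Age Beyond 26" means max(0, age - 26).
    """
    experience_flag = 1 if prior_experience else 0
    age_beyond_26 = max(0, age - 26)  # assumed definition, see docstring
    return (
        0.56 * regional_points
        + 0.41 * open_points
        + 162 * experience_flag
        - 15.68 * age_beyond_26
        + 174
    )


if __name__ == "__main__":
    # Hypothetical athletes with identical Regional and Open scores.
    veteran = predict_men_games_points(500, 400, prior_experience=True, age=24)
    rookie = predict_men_games_points(500, 400, prior_experience=False, age=24)
    older_veteran = predict_men_games_points(500, 400, prior_experience=True, age=30)
    print(f"Veteran, 24: {veteran:.2f}")
    print(f"Rookie, 24: {rookie:.2f}")
    print(f"Veteran, 30: {older_veteran:.2f}")
```

Holding the scores fixed, prior Games experience is worth 162 points, and under the assumed age definition each year past 26 costs 15.68.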


Folks, there is just no way to take the data we have from this year and NOT predict Rich Froning to win the CrossFit Games. The man won the Open and had the top Regional performance overall; he's only 24 AND HE WON THE GAMES LAST YEAR. Do other guys have a shot? Certainly - see my post "So who CAN win the CrossFit Games?" But Froning is absolutely the man to beat.

You'll notice that the top projection for a Games rookie is Kenneth Leverich at 22. I would guess that we'll see a rookie finish higher than that, but I think it's unlikely we'll have another Josh Bridges come in and finish on the podium on his first try. 

So there you have it. Are these predictions going to materialize perfectly? Of course not. Are some of the predictions debatable? Yes (I mean, Patrick Burke at 33 seems pretty low even to me). The R-squared for the women's model, using last year's results, was 56%, so that's basically saying 44% of the variance is not explained by this model. For the men, the R-squared is about 66%. That tells you how difficult it is to predict the Games.
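For anyone fuzzy on R-squared, here is a minimal sketch of the standard calculation: one minus the ratio of unexplained variation to total variation. The numbers are invented for illustration; they are not last year's Games results, and this is not necessarily how my stats software reported the figure.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that is explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the actual values around their mean.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation left over after the model's predictions.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


if __name__ == "__main__":
    # Hypothetical actual vs. projected Games points for four athletes.
    actual_points = [100, 200, 300, 400]
    projected_points = [150, 200, 300, 350]
    print(f"R-squared: {r_squared(actual_points, projected_points):.2f}")
```

An R-squared of 0.56 means the leftover variation is 44% of the total, which is where the "44% not explained" figure comes from.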

But on the whole, given the data we have, this is what I'm going with. Think you can do better? By all means, post your top 10 or even your top 50 and we'll see how it all shakes out.

ENJOY THE GAMES, EVERYONE!





3 comments:

  1. Loving your posts, Anders! It's possible I spent an entire afternoon at work reading them all. I don't suppose you have a set of open data including athlete age? With the recent announcement of a new Masters category, I'd like to have a peek at who in that crowd might make it to the Games, and unfortunately I'm hopeless at scraping it myself.

  2. Jen,

    Thanks a bunch for reading. The Open data I got was downloaded from here: http://media.jsza.com/CFOpen2012-wk5.zip. It's in .csv format, which was a bit of a pain in Excel (well, for my Excel on a Mac at least), but if you're using R or some other statistical software it shouldn't be too bad. As I mentioned in my post about the scoring system, Jeff King did all the heavy lifting getting this data pulled down long before I got started on my blog, so all the credit goes to him. I'm pretty sure this has age on it, but I can't remember off the top of my head.

  3. And now that I am looking further: I didn't make the connection at first, but I actually found that data through your site (CrossFit Open Analysis). So I'd assume you were already aware of that data source - does it not have age? Or maybe I'm misinterpreting your question.
