Monday, June 11, 2012

A fairer Regional comparison

Welcome to CFG Analysis. I'll keep the intro short today (perhaps I'll save a more long-winded version for another day) and just say that I'm a CrossFit athlete as well as a bit of a math nerd (I get paid to do it, so at least someone thinks I have some credibility). I've been intrigued by the statistical side of CrossFit since day 1: the constant and meticulous record-keeping for every workout and the frequent monitoring of progress over days, months and years. The CrossFit Games has taken this to another level.

I got the idea to start this blog after looking at the 2012 Regional Comparison featured on the CrossFit Games Update Show last week (http://scores.loadtimerounds.com/static/g12/regional-comparison.html?file=regional-women.json - great work that I truly appreciate). The comparison was interesting for sure, but in my opinion, it didn't tell the whole story. What was missing, at least in my mind, was some way to account for the advantage certain regions have over others because of the staggered schedule. Anecdotally, we heard accounts of athletes shaving minutes off their times in certain workouts over the course of several weeks of preparation, and as an athlete myself, I knew that the additional time for the later regions undoubtedly improved those scores. But to what degree?

I began my analysis by taking the results for the top 16 finishers from each region and comparing the averages across regions. A couple of notes here - first, I limited this to the top 16 because a) these were the only athletes to complete Event 6 and b) this reduced the impact of outliers at the low end of the competition; second, I disregarded Asia, Africa and Latin America from this portion of the analysis because the level of competition in those regions is simply not on par with the rest of the world at this point.

Additionally, I accounted for the time cap by adding 10 seconds for each missed rep on events 2 and 6 and 5 seconds for each missed rep on event 4. I did this because the 1 second per rep that was originally in the data would skew any averages I did. In reality, a finish of 17:10 on Event 6 meant the athlete had 6 burpee-box jumps, 1 farmers carry and 3 muscle-ups left - those 10 reps would take far more than 10 seconds (perhaps even more than the 100 seconds I assumed). Also, I made one final adjustment to the Canada East women's Event 6, adding 60 seconds to all scores, because the bars were loaded with only 205 pounds for the deadlifts instead of 225 (http://games.crossfit.com/video/event-summary-canada-east-womens-workout-6).
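For anyone who wants to replicate the time-cap adjustment, here is a minimal Python sketch. The function name and arguments are my own for illustration, and the 17:00 cap for Event 6 is inferred from the 17:10 example above rather than stated anywhere in the data.

```python
def adjust_capped_time(recorded_seconds, cap_seconds, penalty_per_rep):
    """Re-score a time-capped result with a heavier per-rep penalty.

    The raw data records a capped athlete as cap + 1 second per missed rep,
    so the number of missed reps equals the overage in seconds.
    """
    if recorded_seconds <= cap_seconds:
        return recorded_seconds  # finished under the cap, nothing to adjust
    missed_reps = recorded_seconds - cap_seconds
    return cap_seconds + missed_reps * penalty_per_rep


# Event 6 example from above: 17:10 recorded, 10 reps left, 10 s per rep.
# Assumption: the cap is 17:00, inferred from that example.
EVENT_6_CAP = 17 * 60
adjusted = adjust_capped_time(17 * 60 + 10, EVENT_6_CAP, penalty_per_rep=10)
print(adjusted)  # 1120 seconds, i.e. 18:40
```

The same function covers Event 4 by passing penalty_per_rep=5 along with that event's cap.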

OK, at this point, some trends started to emerge. Below are the average weekly scores for each workout in the men's division.


Although not drastic in the chart, it is clear that for most of the workouts, particularly 1-4, the times tend to decrease (keep in mind that Event 5 is the only one where a higher score is an improvement). For instance, the average time in Event 1 for week 1 (for my sample) was 4:51, but by week 5 that average had dropped to 4:07. Initially, it seemed that a simple linear regression on each event might give us an idea of the effect of the additional weeks on the scores.

However, there were a couple of issues with this. First, the pattern is likely not truly linear. It is not reasonable to assume that athletes continue to cut time on a workout at the same rate for weeks on end. More likely, the improvement happens more rapidly early on and diminishes as the weeks progress - otherwise, we'd be seeing times of 0:00 on these workouts eventually. A trickier issue, however, was the fact that the fields of athletes in the different regions were not equally strong. Clearly, I had to account for this in order to get a meaningful result.

My solution was perhaps not optimal, but it was relatively simple, not too time-consuming, and in my opinion, got the job done. For each region, I looked at the number of the top 16 finishers who were in the top 180 worldwide in the Open. This gave me some idea of the strength of each region, and the results generally jibed with what we believe to be true: the most difficult men's regions were Mid Atlantic (75% of top 16 in the top 180 in the Open), Southern California (69%), Central East (63%), NorCal (63%) and Northeast (63%). The weakest (excluding Asia, Africa and Latin America) were Canada West (13%) and Europe (19%). A plot of the Event 1 men's scores compared with my metric for regional strength is below.
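To make the strength metric concrete, here is a rough Python sketch of the calculation. The input format (a list of worldwide Open ranks in Regional finishing order) and the ranks themselves are made up for illustration; only the top-16 and top-180 cutoffs come from the approach described above.

```python
def region_strength(open_ranks, top_n=16, open_cutoff=180):
    """Share of a region's top finishers who ranked inside the Open cutoff.

    open_ranks: worldwide Open ranks of the region's finishers, listed in
    Regional finishing order (hypothetical input format).
    """
    top_finishers = open_ranks[:top_n]
    in_cutoff = sum(1 for rank in top_finishers if rank <= open_cutoff)
    return in_cutoff / len(top_finishers)


# Made-up ranks for illustration only: 12 of 16 fall inside the top 180.
example_ranks = [5, 12, 20, 33, 41, 58, 77, 90, 104, 131, 150, 179,
                 210, 260, 340, 512]
print(f"{region_strength(example_ranks):.0%}")  # 75%
```

With 16 athletes per region the metric moves in steps of 6.25%, which is why the percentages above cluster at values like 63%, 69% and 75%.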


Now that I had a metric to control for regional strength, I decided to do a multiple linear regression on each event, with region strength and week of competition as my two independent variables and average score on the event as the dependent variable. Although I do not believe a linear model is ideal (as discussed above), it was the simplest option here and because we are not projecting out into the future, it wouldn't produce any unreasonable results. The results, I believe, were in line with what we would expect. For almost every event across the men's and women's competition, the coefficient for the region strength indicated that it was positively correlated with an improved average score (except men's event 3, where the effect was basically negligible). But what we really care about is the effect of the week of competition on the scores, because we can use that to get a fairer comparison between athletes in different regions. Below are the coefficients for week of competition (-1.00 means each week caused a drop of 0:01 in the time or an increase of 1 lb. in the snatch).

Men
Event 1 -10.50, Event 2 -11.86, Event 3 -10.78, Event 4 -18.04, Event 5 +0.13, Event 6 -1.1

Women
Event 1 -10.66, Event 2 -10.75, Event 3 -16.11, Event 4 -7.90, Event 5 -0.33, Event 6 +37.69



For both men and women, I assumed the true coefficients for Events 5 and 6 were 0. The women's Event 6 result (+37.69) was especially curious, but the extremely wide range of results on that workout (often because women struggled at muscle-ups) probably threw things off. It's probably also fair to assume that no significant snatching strength can be gained in 4 weeks, especially for athletes of this caliber.
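For the curious, the regression step can be sketched in plain Python along these lines. This is not the exact tool I used, just an illustration of the technique: an ordinary least squares fit of average score on region strength and week, solved through the normal equations. The function names and the data are hypothetical; the data is synthetic and noise-free, built so the week effect is exactly -10 seconds per week.

```python
def solve_linear_system(a, b):
    """Solve a small square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_event_regression(strength, week, avg_score):
    """Least squares fit of avg_score = b0 + b1 * strength + b2 * week."""
    rows = [[1.0, s, w] for s, w in zip(strength, week)]
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, avg_score)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic data for illustration only (not the real Regional results).
strength = [0.75, 0.69, 0.63, 0.13, 0.19, 0.50, 0.44, 0.31]
week = [1, 2, 3, 4, 5, 1, 3, 5]
avg_score = [300 - 40 * s - 10 * w for s, w in zip(strength, week)]

intercept, strength_coef, week_coef = fit_event_regression(strength, week, avg_score)
print(round(week_coef, 2))  # recovers the built-in week effect of about -10
```

With only one data point per region, a fit like this is sensitive to a single unusual region, which is one way a coefficient like the women's Event 6 figure can come out looking strange.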

Now it was time to adjust each athlete's results to compensate for their week of competition. At this point, I decided that adding or subtracting a flat amount per week was not fair, because athletes are more likely to improve as a percentage of their starting time/score. For instance, Dan Bailey is not gaining 11 seconds per week when he has a Diane time of under 2:00. To fix this, I took my coefficients and divided them by the average score for each event for all competitors in my analysis. To apply these percentages, for each athlete, I would take the week of competition, subtract 3 (the midpoint) and multiply that number by the percentage impact per week for that event. That result would be used to adjust the athlete's score. The final percentage impacts I used were:


Men
Event 1 -4.4%, Event 2 -1.3%, Event 3 -3.7%, Event 4 -1.4%, Event 5 +0%, Event 6 +0%

Women
Event 1 -3.7%, Event 2 -1.2%, Event 3 -3.4%, Event 4 -0.5%, Event 5 +0%, Event 6 +0%
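Here is a small Python sketch of how the adjustment could be applied. The direction of the adjustment is my assumption for this illustration: the expected gain relative to week 3 is removed from the score, so a week 5 time gets slower and a week 1 time gets faster. The function name and the 4:00 example time are hypothetical.

```python
MIDPOINT_WEEK = 3

# Men's weekly percentage impacts from the list above, as fractions.
MEN_PCT_PER_WEEK = {1: -0.044, 2: -0.013, 3: -0.037, 4: -0.014, 5: 0.0, 6: 0.0}


def adjust_score(score, week, pct_per_week, midpoint=MIDPOINT_WEEK):
    """Normalize a score to the midpoint week.

    Assumption: the expected change (week - midpoint) * pct_per_week is
    removed from the score, so later weeks end up with slower adjusted times.
    """
    expected_change = (week - midpoint) * pct_per_week
    return score * (1 - expected_change)


# A hypothetical 4:00 (240 s) Event 1 time, competed in week 5 vs. week 1.
print(round(adjust_score(240, 5, MEN_PCT_PER_WEEK[1]), 1))  # 261.1
print(round(adjust_score(240, 1, MEN_PCT_PER_WEEK[1]), 1))  # 218.9
```

An athlete competing in week 3 is left unchanged, and Events 5 and 6 are never adjusted because their percentages are 0.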

Whew... This is getting long. We'll finish up the analysis (with your final leaderboard) in the next post.
