Today's post will go in a slightly different direction than the previous three. Instead of focusing on the athletes, I'd like to focus on the events themselves and try to address the question (in the context of CrossFit competitions): "Are certain events better than others?"
Intuitively, I think most would agree the answer is yes. An event like, say, "Fran" will almost certainly test a person's fitness better than something like a competition to hit the longest drive with a golf ball. One way to look at this is using the purely CrossFit definition of fitness: of the 10 general physical skills (http://library.crossfit.com/free/pdf/CFJ_Trial_04_2012.pdf), I would say Fran hits on just about all of them (except maybe accuracy, balance and agility), while a long-drive competition hits on maybe three (accuracy, coordination and speed).
But it gets tricky to prove which events are better than others using only the 10 general physical skills. Even in the above example, there is some wiggle room in saying which skills are tested by those two events. So let me propose another definition of what makes a good test of fitness: a good CrossFit event will provide a strong indication of an athlete's ability to perform well in a wide variety of OTHER tests. To help explain, consider this example:
My contention here (and most CrossFitters would probably agree) is that "Elizabeth" is a better test of fitness than either a 5K run or a max bench press. For one, it tests more of the 10 physical skills, but more importantly, I believe it does a better job of indicating which athletes would perform well in a wide variety of other tests. In this example, I assumed that there is generally no correlation between running a 5K and bench pressing. As such, I assigned random rankings to those events. However, a person who is strong on "Elizabeth" will probably do fairly well at both a 5K and a max bench press. So in this example, the person who has the best combined rank across the 5K and bench press also has the top rank on "Elizabeth."
This is an extreme example, and obviously I have rigged it, but it gives us an idea of how we can get a feel for which other events are good tests of fitness. The way we can do this is to look at the correlation between an athlete's finish on one event and their combined finish on a variety of other events. In the above example, the correlation between "Elizabeth" and the combined ranks on the other two is 98%. The correlation for the 5K run vs. the other two is 0%, and for the bench press vs. the other two it is -8%. What this says is that the bench press and the 5K don't tell us nearly as much as "Elizabeth." All we need to test is "Elizabeth," because that tells us just as much as testing all three events. Note that we are talking purely about TESTING fitness here, not training for it. It might be the case that the 5K is worthwhile in training, but not as much in testing.
So on this theoretical basis, I decided to look at the events we have seen thus far in 2012. Like I did in my first analysis, I limited the field to athletes who completed all 6 events at regionals, which gives us a sample of about 250 men and 250 women. I used my adjusted regional results (see first two posts), and my measure of how well an athlete did on each event was simply the rank*. For each event, I looked at the correlation between an athlete's rank and his/her combined rank on the other 10 events.
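A minimal Python sketch of this calculation follows. It assumes the combined rank is simply the sum of an athlete's ranks on the other events and that a plain Pearson correlation is applied to those numbers; the post does not spell out either detail. The function names and the five-athlete ranks are made up for illustration and are not the actual Open or Regional data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def event_vs_rest(ranks):
    """Correlate each event's ranks with the summed ranks on all other events.

    ranks maps an event name to a list of ranks, one per athlete, with the
    athletes in the same order for every event.
    ASSUMPTION: "combined rank" means the sum of ranks on the other events.
    """
    n_athletes = len(next(iter(ranks.values())))
    # Total rank across every event, per athlete.
    totals = [sum(r[i] for r in ranks.values()) for i in range(n_athletes)]
    result = {}
    for event, event_ranks in ranks.items():
        # Remove the event being tested so it is not compared with itself.
        rest = [totals[i] - event_ranks[i] for i in range(n_athletes)]
        result[event] = pearson(event_ranks, rest)
    return result


# Hypothetical ranks for five athletes on three events (illustration only).
example_ranks = {
    "Event A": [1, 2, 3, 4, 5],
    "Event B": [2, 1, 3, 5, 4],
    "Event C": [5, 3, 1, 2, 4],
}

correlations = event_vs_rest(example_ranks)
for event, r in correlations.items():
    print(f"{event}: correlation with the other events = {r:.2f}")
```

Subtracting the event's own rank from the total matters: if the event were left in the combined score, every event would look more predictive than it really is.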
Let's start with a visual representation. Here is a scatter plot of the men's Open WOD 3 ranks (x-axis) vs. the combined ranks on all other events.
It is fairly clear from this plot that a better rank on Open Event 3 (further left) corresponds to better results on the other events. Now let's look at the same scatter plot for men's Regional WOD 1 ("Diane").
Whoa. While there does appear to be some weak correlation, it's pretty clear that the results from "Diane" don't do much to predict how an athlete will do on the other events. I don't find this particularly surprising. For years, we have seen otherwise solid athletes struggle with handstand push-ups, and at the elite level, there is just no way to make up any ground on the deadlifts at a weight as light as 225. So basically we are testing handstand push-ups, which do tell us something about an athlete's overall fitness, but not much - certainly not as much as we can learn by testing 18 minutes of box jumps, medium load push press and toes-to-bar.
So how do the events stack up in terms of correlation**? Well, here are the results, with women first and men second:
Well, would you look at that? Men's Open Event 3 had the highest correlation and Men's Regional Event 1 had the lowest. You'd almost think I chose those two graphs on purpose. It is clear, though, that for both men and women, Open Event 3, Regional Event 4 and Regional Event 2 were strong predictors of success across the board, while Regional Event 3 and Open Event 1 did not tell us as much. Regional Event 1 did have a somewhat higher correlation for women than for men, possibly because the event was not so blazing fast.
We can see another trend from this chart as well: events with more movements tend to be better predictors of overall fitness. While this is not surprising, I think it is an important point. Single-modality events simply do not tell us as much about an athlete as a couplet, triplet or chipper***. I do not believe we should eliminate them from competitions for this reason, but I do think some consideration should be given to weighting these events less heavily. The Games struck a good balance last year, in my opinion, by grouping the single-modality events together into "Skills Tests," which didn't put as much weight on any one of those movements. I think giving a max effort snatch or an extremely heavy dumbbell snatch the same weight as something like Regional Event 4 may not be appropriate (somewhere, Chris Spealler is nodding his head right now).
This is certainly not a topic with one absolute right or wrong answer. I would be very interested to see other opinions, not only on what I have done, but also on what defines a "good" CrossFit event.
*Note: I also looked at this another way, which was to give each athlete a score on each event that was equal to the percentage of work done relative to the overall top score/time. For now, I will ignore those results because they are generally the same as these.
**To give some perspective on what these correlations mean, you can square the values to get the "r-squared." Men's Open Event 3, for instance, has an r-squared of 56%, while Men's Regional Event 1 has an r-squared of 20%. One rough interpretation of the r-squared is that it tells you how much of the variance in the other events' scores is explained by the event we are using as a predictor. So Men's Open Event 3 explains about 56% of the variance in the other events' scores.
***Yes, Regional Event 5 actually had two movements, but the double-unders didn't have much of an impact other than as a tiebreaker. You could also argue that Regional Event 3 was basically only one movement, too, since the impact of the running was negligible for most athletes.
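As a small illustration of the r-squared footnote above, the sketch below converts between a correlation and its r-squared. The two r-squared values are the ones quoted in that footnote; the function names are hypothetical, and the square root only recovers the size of the correlation, not its sign.

```python
from math import sqrt


def r_squared(r):
    """Square a correlation to get the share of variance explained."""
    return r * r


def implied_correlation(r2):
    """Recover the magnitude of the correlation from an r-squared value."""
    return sqrt(r2)


# r-squared values quoted in the footnote.
quoted = {
    "Men's Open Event 3": 0.56,
    "Men's Regional Event 1": 0.20,
}

for event, r2 in quoted.items():
    print(f"{event}: r-squared = {r2:.2f}, |r| is about {implied_correlation(r2):.2f}")
```

Under these figures, the correlations themselves work out to roughly 0.75 and 0.45 in size.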