The Swim Scout: A Statistical Analysis Approach to Swimming

Jan 20, 2014  - Elliot Meena

Editor's Note: SwimNews can think of no better new team member to honour our founder Nick Thierry's legacy of stats than Elliot Meena, The Swim Scout. In his first piece for SwimNews, Elliot breaks down what many believe is the swim of the century: Jason Lezak's anchor leg on the men's 4x100 free relay in Beijing.

A Statistical Analysis Approach to Swimming

For as long as I can remember, I have been a swimnerd. Heatsheets, record books, results, videos, you name it and I studied it. It never mattered when or where a time was swum because it was, and always will be, a race against the clock. And while some might say that makes it easy, I would say it makes it simple.

As all swimmers know, the clock does not lie. It has no emotion. It’s not loyal to any team or country or conference, nor can it be influenced by the reaction of a coach or player. It is one of the most accurate measurements of success, capable of stripping away much of the noise that could affect the outcome of other sports, and leaves everyone in agreement on knowing what needs to be done in order to win. Simply touch the wall first (1).

But the story that the clock does not tell is the one about what comprised the swim in the first place, and I am not talking about splits. True, a best time is a sum of its splits, but the splits are the sum of strokes and kicks, and those strokes and kicks are being performed by a male or a female who could differ from the next swimmer by up to four feet in height and up to 100 lbs in weight. Catch my drift? An efficient race is an effective race, and that is what statistical analysis of swimming can tell you: how to tailor your race to exactly the type of swimmer you are (2).

Heck, whether you saw (or read) Moneyball or not, you’ve undoubtedly heard of Brad Pitt and Jonah Hill so the phrase “statistical analysis” is not foreign to you. But if you were anything like me when I was a swimmer, all you did was count, and never measure. But even the counting had to start somewhere, and for me it was watching Nate Dusing tie the 100-yard butterfly National High School record at the Kentucky State High School Championships in January of 1997.

I remember this race for a number of reasons. One was that it was the first time I saw how fast underwater kicking could be (no offense to Pankratov in '96). Most importantly, though, it was the first time I caught myself counting strokes without even thinking about doing it.

Dusing, a Cincinnati Marlin who attended Covington Catholic in Northern Kentucky, was so quick underwater that he only needed one stroke the first lap and two the second to take his 100 fly out in 21.69. I’ll say that again, 21.69! For those of you who do not remember, the Junior National qualifying time in the 50-yard freestyle back then was 21.69, and this guy still had two more laps to go. Admittedly, I stopped counting on the second 50 as I was still in shock from his first, but seeing 47.10 on the scoreboard certainly brought me back to reality.

From that day until my Senior Day at the University of Florida, I counted my strokes because I couldn’t fathom building a race strategy without knowing.

However, it wasn’t until after I retired and started to view the sport as a spectator that I began to see all the ways that individual units of measurement could be compared against one another to show potential indicators of success. One of the first races that caught my eye from a statistical analysis standpoint was the Men's 4x100 Freestyle Relay Final at the 2008 Beijing Olympics, a race I was fortunate enough to witness in person.

Now, like many others, I thought anchor Jason Lezak’s chances of winning were right up there with Lloyd Christmas’ winning over Mary Samsonite, but I do remember thinking that France's Alain Bernard rushed the first 50, and that could leave the door open for Lezak.

Now we all know what happened next, but I am going to tell you anyway. Bernard faded and Captain America soared to victory by running down a former world record holder, in what is arguably one of the greatest single Olympic performances ever. But what most of us don’t know is exactly how it happened and, more interestingly, how easily it could have gone the other way. Even though Lezak was technically within reach when he dove in, Bernard, with clear water on both sides, should have outsplit him by ~0.40.

*The following exhibits do not account for any differential in textile vs. non-textile suits

Exhibit 1 Men’s 400 Freestyle Relay Anchor Leg Splits, Beijing Olympics, 2008 (3)

Jason Lezak
Splits: 21.50/24.56 = 46.06 (14% increase, 50-over-50)
Stroke Count: 29/34 = 63 (17% increase, 50-over-50)

Alain Bernard
Splits: 21.27/25.46 = 46.73 (20% increase, 50-over-50)
Stroke Count: 34/42 = 76 (24% increase, 50-over-50)

Cool, huh? Actually, not really. All this tells us is that Bernard is a splash-and-dash sprinter, whereas Lezak is a closer. It doesn’t tell us anything about how else the race could have ended. In order to see that, we need to look at another race in which both swimmers competed, and what better than the Men’s 100-Meter Freestyle Individual Final of the same Olympiad.

Exhibit 2 Men’s 100 Freestyle Final, Beijing Olympics, 2008

Jason Lezak (Third Place)
Splits: 22.86/24.81 = 47.67 (9% increase, 50-over-50)
Stroke Count: 29/37 = 66 (28% increase, 50-over-50)

Alain Bernard (First Place)
Splits: 22.53/24.68 = 47.21 (10% increase, 50-over-50)
Stroke Count: 34/41 = 75 (21% increase, 50-over-50)
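
For readers who want to check the math, the 50-over-50 figures in Exhibits 1 and 2 can be reproduced with a few lines of Python. This is only a sketch: the function name `fifty_over_fifty` is my own label, and it assumes "50-over-50" means the percent increase of the second 50 relative to the first, which is the reading that matches the exhibits.

```python
def fifty_over_fifty(first, second):
    """Percent increase of the second 50 over the first 50 (assumed definition)."""
    return (second - first) / first * 100

# (first 50, second 50) pairs copied from Exhibits 1 and 2
exhibits = {
    "Lezak relay splits": (21.50, 24.56),
    "Lezak relay strokes": (29, 34),
    "Bernard relay splits": (21.27, 25.46),
    "Bernard relay strokes": (34, 42),
    "Lezak individual splits": (22.86, 24.81),
    "Lezak individual strokes": (29, 37),
    "Bernard individual splits": (22.53, 24.68),
    "Bernard individual strokes": (34, 41),
}

for label, (first, second) in exhibits.items():
    print(f"{label}: {fifty_over_fifty(first, second):.0f}% increase")
```

The same function works for both times and stroke counts, because each is just a ratio of the back half to the front half.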

This comparable analysis shows us that, under neutral circumstances, Alain Bernard is 0.46 seconds faster than Jason Lezak, and he accomplishes that by swimming a controlled first 50 in 34 strokes so that he is able to bring home his second 50 within 10% of his first. In the relay, Bernard lost by 0.08 seconds. Had he maintained the same race strategy as in his individual race, he could have afforded to go out an entire second (well, 0.99 seconds to be exact) slower, and still would have been able to hold off Lezak.
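
Here is one way to arrive at that 0.99 figure, sketched in Python. The key assumption, which is mine, is that "same race strategy" means Bernard's second 50 keeps the same ratio to his first 50 that it had in the individual final. The variable names are just labels for the numbers in the exhibits.

```python
# Figures copied from Exhibits 1 and 2 and the paragraph above.
BERNARD_RELAY_OUT = 21.27      # Bernard's first 50 in the relay
BERNARD_RELAY_SPLIT = 46.73    # Bernard's full relay split
RELAY_LOSING_MARGIN = 0.08     # Bernard lost by 0.08 seconds
INDIVIDUAL_OUT, INDIVIDUAL_BACK = 22.53, 24.68  # Bernard's individual final

# Assumption: "same race strategy" = same back-half/front-half ratio.
fade_ratio = INDIVIDUAL_BACK / INDIVIDUAL_OUT

# The split Bernard needed in order to at least tie Lezak at the wall.
split_needed = BERNARD_RELAY_SPLIT - RELAY_LOSING_MARGIN

# split = out + out * fade_ratio, so solve for the slowest acceptable first 50.
slowest_out = split_needed / (1 + fade_ratio)
cushion = slowest_out - BERNARD_RELAY_OUT

print(f"slowest first 50 that still holds off Lezak: {slowest_out:.2f}")
print(f"cushion versus his actual relay first 50: {cushion:.2f}")
```

Under that assumption, the cushion comes out just over 0.99 seconds, so going out a full 1.00 slower would have been a hair too much.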

And, just for fun, let’s look at how Lezak’s swim stacks up in comparison to others.

Exhibit 3 Establishing a Benchmark to Compare Time and Length

Lezak split 2.50% faster than the existing world record of 47.24, which was set by Eamon Sullivan from Australia with his lead-off leg about 100 seconds earlier.

The following are a few select events and their adjusted times, found by lowering the existing record by 2.50%:

100 Butterfly Male World Record = 48.58 (Existing = 49.82, Michael Phelps)

100 Backstroke Male World Record = 50.64 (Existing = 51.94, Aaron Peirsol)

100 Backstroke Female World Record = 56.67 (Existing = 58.12, Gemma Spofforth)

800 Freestyle Female World Record = 8:01.52 (Existing = 8:13.86, Katie Ledecky)

50 Freestyle NCAA Male Division 1 Record = 18.01 (Existing = 18.47, Cesar Cielo)

200 Freestyle NCAA Male Division 1 Record = 1:28.92 (Existing = 1:31.20, Simon Burnett)

100 Butterfly NCAA Female Division 1 Record = 48.76 (Existing = 50.01, Natalie Coughlin)

200 Breaststroke NCAA Female Division 1 Record = 2:01.37 (Existing = 2:04.48, Breeja Larson)
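
The adjusted times above can be reproduced with a short Python sketch. It assumes the scaling factor is Lezak's split divided by Sullivan's record (46.06 / 47.24) rather than a flat 0.975, since that matches the listed figures to the hundredth. The helper names `to_seconds` and `to_mark` are my own.

```python
LEZAK_SPLIT = 46.06
EXISTING_100_FREE_RECORD = 47.24
RATIO = LEZAK_SPLIT / EXISTING_100_FREE_RECORD  # about 0.975, i.e. 2.50% faster

def to_seconds(mark):
    """Convert a time written as 'M:SS.ss' or 'SS.ss' into seconds."""
    minutes, _, seconds = mark.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)

def to_mark(seconds):
    """Format seconds back into 'M:SS.ss' or 'SS.ss', rounded to hundredths."""
    hundredths = round(seconds * 100)
    minutes, rest = divmod(hundredths, 6000)
    if minutes:
        return f"{minutes}:{rest / 100:05.2f}"
    return f"{rest / 100:.2f}"

# Existing records copied from Exhibit 3
records = {
    "100 Butterfly Male World Record": "49.82",
    "100 Backstroke Male World Record": "51.94",
    "100 Backstroke Female World Record": "58.12",
    "800 Freestyle Female World Record": "8:13.86",
    "50 Freestyle NCAA Male Division 1 Record": "18.47",
    "200 Freestyle NCAA Male Division 1 Record": "1:31.20",
    "100 Butterfly NCAA Female Division 1 Record": "50.01",
    "200 Breaststroke NCAA Female Division 1 Record": "2:04.48",
}

for event, existing in records.items():
    adjusted = to_mark(to_seconds(existing) * RATIO)
    print(f"{event} = {adjusted} (Existing = {existing})")
```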

Now I know you could argue that his relay exchange reaction time (0.06, if I recall) should be weighed against a flat start (call it 0.72 to be fair), but that in no way should negate the praise this swim deserves. It only argues for the need to dive deeper (pun intended) into the elements of the swimming equation.

Hey, we may never see another 46.06 relay split again. But the numbers support that we will witness someone dropping 2.50% off a record again. I just hope that if it comes under circumstances like the ones surrounding that race in Beijing, I can be there to witness it.

1) The clock is the first part of the Statistical Analysis equation. In the beginning, the majority of my articles and case studies will use the clock as a variable.
2) The demographic is the second part of the Statistical Analysis equation. Case studies will incorporate more demographic data as it is gathered, both physical (height, weight, wingspan) and race specific (stroke count, underwater kicks).
3) Source: NBC Olympics