Image Classification for Swimming Videos with ML and a Coach’s Eye

A friend once told me you don’t need to be a great swimmer to be an amazing coach, but it sure does help. The athletic experience is personal, and coaching is about sharing. So I understood a great deal about swimming with a coach’s eye for detail, with the exception of breaststroke.

But you can only scale knowledge once you better understand the results you are searching for in the process. And more people deserve access to that information without it demanding too much time from athlete or coach.

10,000 hours of ‘paying attention’

In the beginning, I never knew what to look for until someone else pointed it out. By the end, I could hear it. Like the difference between a curveball and a fastball. So why not let a machine watch some swimming and see what it can learn from a coach’s eye?

The purpose of this project was to develop software that detects swimming strokes: freestyle, breaststroke, backstroke, and butterfly. The ultimate goal is to gauge whether neural network-based image classification can be a useful tool for coaching competitive swimmers.

The image classification model was developed using TensorFlow, with neural networks from the Inception-v3 image classification model. The model outputs confidence scores between 0 and 1, as shown at right, which indicate how confident the model is in its inference. A score of 1.0 represents absolute certainty.
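To make the confidence scores concrete, here is a minimal sketch of how raw model outputs are commonly turned into scores between 0 and 1. It assumes the scores come from a softmax over the four stroke classes, which is typical for classifiers built on Inception-v3 but is not confirmed for this project. The function names and the example numbers are made up for illustration.

```python
import math

# The four stroke categories used in this project.
STROKES = ["backstroke", "breaststroke", "butterfly", "freestyle"]


def softmax(logits):
    """Convert raw scores into confidence scores that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels=STROKES):
    """Return (label, confidence) for the highest-scoring class."""
    scores = softmax(logits)
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


# Hypothetical raw outputs for a single frame (illustrative numbers only).
example_logits = [0.2, 3.1, 0.5, 1.0]
label, confidence = top_prediction(example_logits)
print(f"{label}: {confidence:.2f}")
```

Because the scores always sum to 1, a score only reaches 1.0 when every other stroke is effectively ruled out.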

Training Data

7412 images in 4 categories were provided for training the model. The categories were named backstroke (1228 images), breaststroke (1580 images), butterfly (2447 images), and freestyle (2157 images).
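The categories are not evenly sized: butterfly has roughly twice as many images as backstroke. A quick sketch like the one below shows each category's share of the training set. It also computes "balanced" class weights, a common way to compensate for uneven categories. The weighting is an assumption added for illustration, not something this project is stated to have used.

```python
# Image counts per category, as reported above.
counts = {
    "backstroke": 1228,
    "breaststroke": 1580,
    "butterfly": 2447,
    "freestyle": 2157,
}


def class_shares(counts):
    """Fraction of the training set that falls in each category."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def balanced_class_weights(counts):
    """Weight = total / (num_classes * count), so rarer classes weigh more."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


shares = class_shares(counts)
weights = balanced_class_weights(counts)
for name in counts:
    print(f"{name:12s} share={shares[name]:.1%} weight={weights[name]:.2f}")
```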

Model Testing and Validation

The figure below shows accuracy as the percentage of correctly labeled images in a randomly selected group of images not used for training. Typically, we like to see accuracy values between 85% and 99%.

In this case, some accuracies were in that range. Be aware, however, that the model will not be this accurate when it is applied to images that differ substantially from the images used for training.
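The accuracy described here is simple to compute by hand. The sketch below shows one way to do it, overall and per stroke, given the true labels of the held-out images and the model's predictions. The function names and the tiny example lists are hypothetical and are not the project's actual results.

```python
from collections import Counter


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of held-out images whose predicted label matches the truth."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def per_class_accuracy(true_labels, predicted_labels):
    """Accuracy computed separately for each true category."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    seen = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: correct[label] / seen[label] for label in seen}


# Made-up labels for four held-out images (illustration only).
truth = ["freestyle", "freestyle", "butterfly", "backstroke"]
preds = ["freestyle", "butterfly", "butterfly", "backstroke"]
print(overall_accuracy(truth, preds))
print(per_class_accuracy(truth, preds))
```

Looking at accuracy per stroke, rather than one overall number, makes it easier to spot a category the model struggles with.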

Bring Some Wonder to the Pool Deck with your Camera Lens

Interactivity helps bridge the gap between a coach’s eye and an athlete’s desire to improve technically. Sometimes coaching can feel like learning a foreign language, and athletes can focus on outcomes too often.

So first we filmed to review later, then we filmed with a “hacked together” DVR and security camera, and now we bring motion tracking with image classification.

Data in sports: yes, please. Firsthand experience with failure is one of the most powerful ways to learn. Whether coach, athlete, or machine, we can all try out “citius, altius, fortius” as a way to challenge how we communicate as well.