
machine learning algorithms will evaluate a performed movement exercise. In the first step,
these algorithms are to classify the movement quality on a scale. In a future step, the patient
should additionally receive feedback about the errors in the execution. This
publication presents the developments made for the first step.
We explore this topic using screening exercises from Functional Movement Screening (FMS) [1,2].
FMS is an assessment system consisting of seven exercises that can be used to
systematically determine movement restrictions or weaknesses in the human musculoskeletal
system. Each exercise is assigned a score of 3 (perfect execution), 2 (complete execution with
compensation movements), 1 (incomplete execution, even with compensation movements)
and 0 (pain occurred). For each exercise there is a well-defined list of movement characteristics
that must be fulfilled in order to receive a certain score. FMS is chosen because it is widely
used and one of the most popular screening systems in the field of sports physiotherapy. In
addition, Cook’s system has demonstrated high interrater and intrarater reliability [3].
This is especially relevant for machine learning applications that depend
on unambiguous label information. Hart et al. and Ordonez et al. have made suggestions for
a suitable dataset based on existing approaches to automated exercise evaluation [4,5]. The
dataset should contain recordings from an appreciably large number of subjects. It should contain
many different variations of exercise execution, both incorrect and correct. These variants
should not be staged. To test the applicability of a methodology to multiple exercise types,
the dataset should contain different exercises. Based on these recommendations, we recorded,
processed, and labeled an FMS dataset of 4 different exercises performed by 20 subjects.
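As an illustration only, the FMS scoring scale and the coverage requirement for the dataset could be represented as in the following minimal sketch. The record layout, the class and function names, and the exercise names used in the usage note are hypothetical and are not taken from the recorded dataset.

```python
from dataclasses import dataclass
from enum import IntEnum


class FMSScore(IntEnum):
    """FMS score per exercise, following the scale described in the text."""
    PAIN = 0         # pain occurred
    INCOMPLETE = 1   # incomplete execution, even with compensation movements
    COMPENSATED = 2  # complete execution with compensation movements
    PERFECT = 3      # perfect execution


@dataclass(frozen=True)
class LabeledRecording:
    """One labeled exercise execution (hypothetical record layout)."""
    subject_id: int
    exercise: str
    score: FMSScore


def subjects_per_exercise(recordings):
    """Count distinct subjects per exercise to check dataset coverage."""
    subjects = {}
    for rec in recordings:
        subjects.setdefault(rec.exercise, set()).add(rec.subject_id)
    return {exercise: len(ids) for exercise, ids in subjects.items()}
```

Calling `subjects_per_exercise` on a list of `LabeledRecording` objects returns a dictionary that maps each exercise name (for example a placeholder such as `"deep_squat"`) to the number of distinct subjects who performed it.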
Studies in the field of exercise evaluation rely on different measurement systems. Depth
cameras [6–10] or RGB cameras in combination with human pose estimation [11,12] are often
used. In order to track the entire body posture, several cameras are required at different
positions in the training room. The training room requires a comparatively large empty area
for this. In addition, these cameras must be calibrated to a common coordinate system before
a measurement is taken in order to provide usable results. This process is time-consuming and
requires considerable computing resources and experience. Such demands on the training room,
prior knowledge, and resources cannot be expected of a user. Inertial measurement units (IMUs)
are a suitable alternative to these camera-based systems. These units record acceleration,
angular velocity, and magnetic field strength along three axes. Attached to the body segments,
these measuring units can be used to record the kinematics of human movement. In
addition, they are comparatively inexpensive, do not require additional space, and, in an
appropriately designed measurement system, do not cause additional work for the user
during operation.
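To make the structure of such IMU data concrete, the sketch below shows how one sample with three quantities on three axes yields nine channels per unit, and how the samples of several body segments could be concatenated into one feature vector per time step. The type names, the segment names, and the channel ordering are assumptions made for illustration and do not describe a specific sensor interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class IMUSample:
    """One reading of a single IMU: three quantities on three axes each."""
    acceleration: Vec3
    angular_velocity: Vec3
    magnetic_field: Vec3

    def as_channels(self) -> List[float]:
        # Assumed channel order: accelerometer, gyroscope, magnetometer.
        return [*self.acceleration, *self.angular_velocity, *self.magnetic_field]


def frame_vector(samples_by_segment: Dict[str, IMUSample]) -> List[float]:
    """Concatenate the samples of all body segments for one time step.

    Segments are visited in sorted name order so that the channel layout
    stays the same from frame to frame.
    """
    channels: List[float] = []
    for segment in sorted(samples_by_segment):
        channels.extend(samples_by_segment[segment].as_channels())
    return channels
```

With this layout, a recording of T time steps from N units becomes a sequence of T vectors with 9 N entries each, which is the kind of multichannel time series that the classification methods discussed in the following operate on.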
Based on these IMU data, we want to automate the evaluation of FMS exercises. Current
studies mainly rely on classical machine learning methods such as decision trees [13–15], random
forests [9,16], or support vector machines [14,17]. In contrast, very few studies to date (e.g., [18])
use deep learning methods in combination with IMU data for exercise evaluation, as also noted
in systematic reviews on this topic [4,5]. This is particularly notable, as these techniques
are already widely used in related topics. Human Activity Recognition (HAR), for example,
has been performed with IMU data and deep learning methods for quite some time [19]. Deep
learning methods offer some advantages over classical machine learning methods, such as
automatic feature engineering and the option to apply transfer learning. Ordonez
et al. have shown that a combined structure of CNNs and LSTMs can
distinguish between different activities using IMU data [20]. Lee et al. have already compared
the performance of a random forest approach with an approach based on a combination of a
CNN and an LSTM on an exercise evaluation task [18]. When classifying a squat into different
performance variants based on the data of five IMUs, the deep learning approach (accuracy:
91.7%) achieved significantly better results than the classical approach (accuracy: 75.4%).
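The general idea of combining a CNN with an LSTM for multichannel IMU time series can be sketched as follows, here using PyTorch. This is a minimal illustration and not the architecture of the cited works: the class name, the number of layers, the kernel sizes, and the hidden size are arbitrary assumptions.

```python
import torch
from torch import nn


class ConvLSTMClassifier(nn.Module):
    """Illustrative CNN + LSTM classifier for IMU time series.

    All layer sizes are placeholder assumptions, not values from the literature.
    """

    def __init__(self, n_channels: int, n_classes: int, hidden_size: int = 64):
        super().__init__()
        # 1D convolutions along the time axis learn local features per window.
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # The LSTM models the temporal order of the convolutional features.
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, time, channels); Conv1d expects (batch, channels, time).
        features = self.conv(x.transpose(1, 2)).transpose(1, 2)
        _, (h_n, _) = self.lstm(features)
        # h_n[-1] is the final hidden state of the last LSTM layer: (batch, hidden).
        return self.head(h_n[-1])
```

For example, under the assumption of five IMUs with nine channels each and four score classes, `ConvLSTMClassifier(n_channels=45, n_classes=4)` maps an input of shape (batch, time, 45) to class logits of shape (batch, 4). The convolutional layers take over the feature extraction that has to be designed by hand in the classical approaches.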