BlanketSet - A clinical real-world in-bed action
recognition and qualitative semi-synchronised
MoCap dataset
João Carmona∗†, Tamás Karácsony∗†‡, João Paulo Silva Cunha∗†, Senior Member, IEEE
∗Center for Biomedical Engineering Research, INESC TEC, Porto, Portugal
†Faculty of Engineering (FEUP), University of Porto, Porto, Portugal
‡Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
Abstract—Clinical in-bed video-based human motion analysis
is a highly relevant computer vision topic for several
biomedical applications. Nevertheless, the main large public
datasets (e.g. ImageNet or 3DPW) used for deep learning
approaches lack annotated examples for these clinical scenarios.
To address this issue, we introduce BlanketSet, an RGB-IR-D
action recognition dataset of sequences performed in a
hospital bed. This dataset has the potential to help transfer the
improvements attained on more general large datasets to these
clinical scenarios. Information on how to access the dataset is
available at rdm.inesctec.pt/dataset/nis-2022-004.
Index Terms—Action recognition dataset, RGB-IR-D Video,
Motion Capture, Human Pose Estimation, Epilepsy monitoring,
Sleep analysis
I. INTRODUCTION
Human motion analysis in videos has recently seen drastic
improvements due to advances in Deep Learning (DL).
However, almost all the research in this area has focused on
the most general context of people standing up without any
occlusions. This, paired with the difficulty of generalizing
DL models beyond the domains of their training datasets,
means that human motion analysis for patients in bed is
sorely lacking when compared with more common scenarios.
In order to bridge this domain gap, we introduce BlanketSet,
an action recognition dataset of people in a hospital
bed which consists of 405 RGB-IR-D recordings of 14
participants performing 8 different movement sequences.
Each sequence was recorded 3 times with different levels
of blanket occlusion, so the dataset can be used not just for action
recognition, but also for qualitative evaluation of human
motion analysis systems. Even though there are no ground
truth poses or body shapes, the actions are carried out in
a semi-synchronized manner, as described in Section III-C. As an
example of this second use case, in this work we used
BlanketSet to evaluate the real-world performance of the
BlanketGen pipeline [1] and found that BlanketGen
improved the performance of a DL human pose estimation
system with statistical significance when full-body blanket
occlusions were present in the real world.
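The repeated-recording design described above lends itself to grouping recordings by participant and movement sequence, so that the same movement can be compared across occlusion levels. The following is only a minimal sketch of that idea: the `Recording` fields, the occlusion labels ("none", "partial", "full") and the toy index are hypothetical and do not reflect the dataset's actual file layout or metadata.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    # All fields are hypothetical; the real dataset may index recordings differently.
    participant: int
    sequence: int
    occlusion: str  # assumed labels: "none", "partial", "full"


def group_by_movement(recordings):
    """Map (participant, sequence) -> {occlusion label: Recording}."""
    groups = defaultdict(dict)
    for rec in recordings:
        groups[(rec.participant, rec.sequence)][rec.occlusion] = rec
    return dict(groups)


def complete_groups(groups, levels=("none", "partial", "full")):
    """Keep only movements that were recorded at every occlusion level."""
    return {
        key: by_level
        for key, by_level in groups.items()
        if all(level in by_level for level in levels)
    }


# Toy index for illustration, not real BlanketSet metadata.
index = [
    Recording(1, 1, "none"),
    Recording(1, 1, "partial"),
    Recording(1, 1, "full"),
    Recording(1, 2, "none"),
    Recording(2, 1, "full"),
]
groups = group_by_movement(index)
comparable = complete_groups(groups)
```

Only the groups returned by `complete_groups` allow a like-for-like comparison of a motion analysis system across all assumed occlusion levels; incomplete groups are dropped rather than padded.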
This work is financed by National Funds through the Portuguese funding
agency, FCT - Fundação para a Ciência e a Tecnologia, within project
LA/P/0063/2020 as well as under the scope of the CMU Portugal (Ref
PRT/BD/152202/2021).
II. RELATED WORK
In-bed movement monitoring is done in different contexts,
the most relevant ones being epileptic seizure analysis and
sleep analysis.
In the area of automatic epileptic seizure classification,
there has been significant research. In [2] and [3], a system
was set up to record RGB-IR-D videos synchronized with
electroencephalography data of patients staying at the University
of Munich Epilepsy Monitoring Unit (EMU), which were then
manually labeled by clinicians. IR data acquired
with that system was used in [4] and [5] to explore deep
learning action recognition at EMUs. Data recorded with this
system is very promising for future research in this area but
is not publicly available, which limits its accessibility.
In the area of sleep analysis, the use of data collected
from cameras positioned above the bed has been explored
as a low-cost non-invasive approach with promising results
[6]–[8]. However, the only image dataset of people lying in
beds that is publicly available is SLP [9], which contains
images of people in static positions along with 2D position
ground truth; it is a useful resource but lacks the temporal
dimension that is very relevant in contexts with considerable
movement such as the analysis of epileptic seizures.
Image and video-based DL systems have been shown to
work exceptionally well in tasks related to human motion
recognition and classification in general use cases [10]–[12];
it is therefore reasonable to expect that they would also be
able to perform well in the more specific use cases of sleep
analysis and epileptic seizure analysis. As with everything
related to deep learning, however, large amounts of data are
required and, due to the lack of publicly available datasets
in these areas, each separate research effort has also had to
include the acquisition of its own dataset, which significantly
hampers the efficiency of the research. The acquisition and
publication of datasets such as [13]–[17] were instrumental to
the incredible results attained in the more general use cases.
There have also been research efforts into utilizing these
broader datasets in epileptic seizure analysis: [4] employed
transfer learning from the Kinetics-400 [16] and ImageNet
[17] datasets to discriminate between two classes of epileptic
seizures, and BlanketGen [1] augmented the 3DPW dataset
arXiv:2210.03600v3 [cs.CV] 19 Mar 2023