Andrea Apicella, Francesco Isgrò, Andrea Pollastro, and Roberto Prevete
of applications [1]. In particular, BCI is attracting growing interest in the scientific
community thanks to its implications in several medical fields, such as assisting
[2], monitoring [3], enhancing [4], or diagnosing patients’ emotional or physical
states [5,6]. Current literature reports that patients undergoing BCI-based
rehabilitation show benefits and improvements in their impaired capacities
[7]. Several methods currently exist to allow interaction between humans
and machines. In particular, many BCI methods based on
Electroencephalographic (EEG) signals have been proposed. This is because measuring and
monitoring the brain’s electrical activity can provide important information
related to the brain’s physiological, functional, and pathological status. EEG
signals are particularly suitable for this aim thanks to their essential qualities, such
as non-invasiveness and high temporal resolution.
Modern Machine Learning (ML) methods such as Deep Neural Networks
(DNNs) are commonly used to process acquired EEG signals for several tasks, such
as emotion classification and engagement and attention detection. In general, a
supervised ML model learns from human-labelled data in order to generalise to new,
unseen data. The standard pipeline to develop an ML system consists of i) data
acquisition, ii) data preprocessing, iii) feature extraction, iv) model learning, and v)
model validation. However, the performance obtained using classical ML
methods in EEG-related tasks is often poor [8]. This is mainly because the EEG signal
is highly non-stationary [9]: substantial differences exist across EEG signals acquired at
different times or from different subjects, even when the same affect is felt.
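The five pipeline stages listed above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the "trials" are synthetic Gaussian noise rather than real EEG, the log-variance feature and the nearest-centroid classifier are stand-ins chosen for brevity, and all function names (`acquire`, `preprocess`, `extract_features`, `fit`, `predict`, `accuracy`, `run_pipeline`) are hypothetical.

```python
import math
import random
import statistics


def acquire(n_trials, n_samples, seed):
    """i) Data acquisition: synthetic single-channel trials (NOT real EEG).

    Assumption for illustration: class 1 trials have a larger amplitude
    than class 0 trials.
    """
    rng = random.Random(seed)
    trials, labels = [], []
    for i in range(n_trials):
        label = i % 2
        scale = 1.0 + label  # hypothetical class-dependent amplitude
        trials.append([rng.gauss(0.0, scale) for _ in range(n_samples)])
        labels.append(label)
    return trials, labels


def preprocess(trial):
    """ii) Data preprocessing: remove the mean (DC offset) of the trial."""
    mean = statistics.fmean(trial)
    return [x - mean for x in trial]


def extract_features(trial):
    """iii) Feature extraction: log-variance as a simple proxy for signal power."""
    return math.log(statistics.pvariance(trial))


def fit(features, labels):
    """iv) Model learning: store one feature centroid per class."""
    return {
        c: statistics.fmean([f for f, y in zip(features, labels) if y == c])
        for c in set(labels)
    }


def predict(model, feature):
    """Assign the class whose centroid is closest to the feature."""
    return min(model, key=lambda c: abs(feature - model[c]))


def accuracy(model, features, labels):
    """v) Model validation: fraction of correctly classified trials."""
    hits = sum(predict(model, f) == y for f, y in zip(features, labels))
    return hits / len(labels)


def run_pipeline(seed=0):
    trials, labels = acquire(n_trials=200, n_samples=128, seed=seed)
    feats = [extract_features(preprocess(t)) for t in trials]
    split = 150  # first 150 trials for training, the rest for validation
    model = fit(feats[:split], labels[:split])
    return accuracy(model, feats[split:], labels[split:])


print(f"held-out accuracy: {run_pipeline():.2f}")
```

In this sketch the training and validation trials are drawn from the same synthetic distribution, so the held-out accuracy is high. Real EEG recordings often do not satisfy this condition.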
In more detail, traditional ML methods start from the hypothesis
that all the data, whether used in the training process or not, come from
the same probability distribution. This assumption does not always hold
in the case of EEG signals. In the ML literature, this is an instance of the
Dataset Shift problem [10]. In a nutshell, a Dataset Shift arises when this
assumption is not valid, so the distribution of the training data differs
from the distribution of the data used outside of the training stage. For example, a
model trained on a set of EEG data acquired from a given subject at a specific
time (or during a specific session) may not work as expected when classifying
EEG signals acquired from a different subject or at a different time. In other words,
the model has poor generalisation performance. A first attempt to mitigate this
problem is to train a specific model for each subject (Subject-Dependent models)
to reduce the performance gap due to using the same ML system on different
users. However, the non-stationarity problems related to the user’s different
physical and psychological conditions at different times remain. Furthermore, a
Subject-Dependent model is valid only for the subject who provided the training data,
making these models expensive, not very versatile, and uncomfortable
for the user, who is tied to initial acquisition sessions before they can
actually use the system for real classifications.
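The effect of Dataset Shift, and the Subject-Dependent workaround, can be illustrated with a toy example. The sketch below assumes two hypothetical subjects whose one-dimensional features follow the same class structure but with a constant offset between subjects. The data are synthetic, not EEG, and the threshold classifier and the names `make_subject`, `fit_threshold`, and `accuracy` are illustrative choices only.

```python
import random
import statistics


def make_subject(offset, n_per_class, seed):
    """Synthetic 1-D features for one hypothetical subject.

    Class 0 is centred at `offset`, class 1 at `offset + 2`.
    """
    rng = random.Random(seed)
    data = []
    for label in (0, 1):
        centre = offset + 2.0 * label
        data += [(rng.gauss(centre, 0.5), label) for _ in range(n_per_class)]
    return data


def fit_threshold(data):
    """Learn a decision threshold halfway between the two class means."""
    mean0 = statistics.fmean([x for x, y in data if y == 0])
    mean1 = statistics.fmean([x for x, y in data if y == 1])
    return (mean0 + mean1) / 2.0


def accuracy(threshold, data):
    """Fraction of samples where 'x > threshold' matches the label."""
    hits = sum((x > threshold) == bool(y) for x, y in data)
    return hits / len(data)


# Subject A provides the training data; subject B is shifted by +2.
train_a = make_subject(offset=0.0, n_per_class=200, seed=1)
test_a = make_subject(offset=0.0, n_per_class=200, seed=2)
train_b = make_subject(offset=2.0, n_per_class=200, seed=3)
test_b = make_subject(offset=2.0, n_per_class=200, seed=4)

threshold_a = fit_threshold(train_a)
acc_same = accuracy(threshold_a, test_a)     # same distribution as training
acc_shifted = accuracy(threshold_a, test_b)  # shifted distribution

# Subject-Dependent alternative: a separate model trained on B's own data.
threshold_b = fit_threshold(train_b)
acc_subject_dependent = accuracy(threshold_b, test_b)

print(f"A -> A: {acc_same:.2f}")
print(f"A -> B: {acc_shifted:.2f}")
print(f"B -> B: {acc_subject_dependent:.2f}")
```

The model trained on subject A separates A's held-out data well but drops to near chance on subject B. Retraining on B's own data restores the accuracy, at the cost of a dedicated acquisition session for that subject.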
For these reasons, more recent studies [11,12] have tried to overcome the limits imposed
by Dataset Shift by taking into account the differences between the probability
distributions (domains) of data acquired at different times and from different
subjects. Several proposed solutions are based on Transfer Learning (TL) [13], a set