Deep learning for ECoG brain-computer
interface: end-to-end vs. hand-crafted features
Maciej Śliwowski1,2 [0000-0001-6744-1714], Matthieu Martin1 [0000-0001-5954-8087],
Antoine Souloumiac2, Pierre Blanchart2, and Tetiana Aksenova1 [0000-0003-4007-2343]
1Univ. Grenoble Alpes, CEA, LETI, Clinatec, F-38000 Grenoble, France
2Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France
Abstract. In brain signal processing, deep learning (DL) models have
become commonly used. However, the performance gain from using end-
to-end DL models compared to conventional ML approaches is usually
significant but moderate, typically at the cost of increased computational
load and deteriorated explainability. The core idea behind deep learning
approaches is scaling the performance with bigger datasets. However,
brain signals are temporal data with a low signal-to-noise ratio, uncertain
labels, and nonstationarity in time. Those factors may influence the
training process and slow down the models' performance improvement.
Their influence may differ between an end-to-end DL model and one
using hand-crafted features.
As this has not been studied before, this paper compares models that use the
raw ECoG signal with models that use time-frequency features for BCI motor
imagery decoding. We investigate whether the current dataset size is a stronger
limitation for either type of model. Finally, the obtained filters were compared
to identify differences between hand-crafted features and those optimized with
backpropagation. To compare the effectiveness of both strategies, we used a
multilayer perceptron and a mix of convolutional and LSTM layers that had
already proved effective in this task. The analysis was performed on the
long-term clinical trial database (almost 600 minutes of recordings) of a
tetraplegic patient executing motor imagery tasks for 3D hand translation.
For a given dataset, the results showed that end-to-end training might
not be significantly better than the hand-crafted features-based model.
The performance gap is reduced with bigger datasets, but considering the
increased computational load, end-to-end training may not be profitable
for this application.
Keywords: deep learning · ECoG · brain-computer interfaces · dataset
size · motor imagery · end-to-end
arXiv:2210.02544v2 [eess.SP] 12 Oct 2022
1 Introduction
In the last decade, deep learning (DL) models have achieved extraordinary
performance in a variety of complex real-life tasks, e.g., computer vision [4]
and natural language processing [2], compared to previously developed models.
This was possible mainly thanks to improvements in data processing units and,
most importantly, increased dataset sizes [4]. Generally, in brain-computer
interface (BCI) research, access to large databases of brain signals is limited
due to experimental and medical constraints as well as the sheer number of
paradigm/hardware combinations. Given limited datasets, can we still train
end-to-end (E2E) DL models for the medical BCI application as effectively as
in computer vision?
In 2019, Roy et al. [12] reported that the number of studies classifying EEG
signals with deep learning using hand-crafted features (mainly frequency
domain) and raw EEG signals (end-to-end) was similar. This indicates that
decoding EEG from raw signals is indeed possible. However, in many articles,
researchers decided to use hand-crafted features, which are harder to design.
While end-to-end models have come to dominate computer vision, in brain signal
processing it is still common to use extracted features as the input to DL
models. It is unclear whether this is caused by specific signal
characteristics, e.g., nonstationarity in time making the creation of a
homogeneous dataset impractical, a low signal-to-noise ratio complicating the
optimization process and favoring overfitting, or label uncertainty
originating from the human-in-the-loop experimental setup, or by researchers'
bias toward solutions that are better understood and more explainable.
Most studies do not directly compare DL using end-to-end and hand-crafted
feature approaches. Usually, DL architectures are compared with each other
and with an additional 'traditional' ML pipeline, e.g., filter-bank common
spatial pattern (FBCSP) in [15], xDAWN and FBCSP in [5], SVM and FBCSP in [17].
Figure 1 presents the accuracy improvement of the best proposed DL model
compared to the 'traditional' baseline for the articles analyzed in [12]³,
depending on the recording time and the number of examples in the dataset. The
performance improvement of DL over the 'traditional' baseline increases with
the dataset size (except for the last points on the plot, which contain
significantly fewer studies). In the right plot, the difference between models
using raw EEG and frequency domain features increases, which may indicate a
boost of end-to-end models with access to bigger datasets compared to
hand-crafted features. However, as the proposed DL models are usually compared
to the baseline, the boost of end-to-end models cannot be clearly stated
because the accuracy difference depends strongly on the 'traditional' baseline
model performance and the particular task tackled in the study.
While EEG and ECoG signals share many characteristics (both are multichannel
temporal signals with information encoded in frequency and space, with low
signal-to-noise ratio and noisy labels), there are also differences, e.g., a
higher spatial resolution of ECoG, a higher signal-to-noise ratio, and a higher
contribution of the informative high gamma band (>70 Hz). In motor imagery ECoG
decoding, end-to-end DL is not commonly used. 'Traditional' ML classifiers are
usually preceded by a feature extraction step creating a brain signal
representation, typically in the form of time-frequency features containing
information about the power time course in several frequency bands [8, 14] or
focused only on the low-frequency component (LFC)/Local Motor Potential (LMP)
[14] (detailed analysis can be found in [19]).
³ Limited to the articles that contained all the required information; code
adapted from [12].
Fig. 1: Binned average accuracy difference between the best proposed DL model
and the 'traditional' baseline on EEG datasets. Error bars denote one standard
deviation of the values in the bin. Bins are equal in size on a logarithmic
scale. The x-axis position of each point denotes the average dataset size in
its bin.
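The binning described in the caption of Fig. 1 (bins of equal width on a logarithmic scale, each point placed at the mean dataset size of its bin, error bars of one standard deviation) could be sketched as follows. This is only an illustration under stated assumptions: the function name `log_binned_mean`, the number of bins, and the use of the population standard deviation are hypothetical choices, not taken from the paper or from the code of [12].

```python
import numpy as np

def log_binned_mean(sizes, diffs, n_bins=5):
    """Average `diffs` in bins that are equally wide in log10(sizes).

    Returns one row per non-empty bin: (mean size, mean diff, std of diff).
    """
    sizes = np.asarray(sizes, dtype=float)
    diffs = np.asarray(diffs, dtype=float)
    # Bin edges equally spaced on a logarithmic axis.
    edges = np.logspace(np.log10(sizes.min()), np.log10(sizes.max()), n_bins + 1)
    # Use interior edges only, so the largest size lands in the last bin.
    idx = np.digitize(sizes, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue  # skip empty bins
        rows.append((sizes[mask].mean(), diffs[mask].mean(), diffs[mask].std()))
    return np.array(rows)
```

Each returned row corresponds to one plotted point: its x-position, its height, and the half-length of its error bar.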
However, an end-to-end DL model was successfully applied to motor imagery
decoding of finger movement trajectories from ECoG, using convolutional layers
that filter the raw signal in both the temporal and spatial domains, followed
by LSTM layers [20]. Smart weight initialization was helpful in achieving high
performance. Nevertheless, the average improvement from training the weights
can be estimated as 0.022 ± 0.0393 in Pearson's r correlation coefficient,
which is relatively small, with a noticeable improvement from end-to-end
training in 66% of cases (at the level of subjects/fingers). As this had not
been studied before, we investigated the differences in data requirements
between an end-to-end model and one using hand-crafted features on a long-term
clinical trial BCI dataset of a 3D target reach task. Unique long-term
recordings (several months of experiments, more than 600 min duration in total,
compared to a few minutes of ECoG recording available in previous studies,
e.g., [20]) allowed us to explore the relationship between dataset size and the
type of feature used for ECoG signal decoding. In this study, we used
architectures previously applied to the ECoG dataset for decoding motor imagery
signals with hand-crafted time-frequency features as input [16]. In addition,
we optimized the temporal filtering layer with backpropagation, seeking a more
efficient set of filters, which were initialized to reproduce the continuous
wavelet transform. We also investigated whether both approaches react
differently to training dataset perturbations, which may be the case due to
distinct model properties and may influence the choice of the optimal data
processing pipeline for ECoG BCI.
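To make the idea of a temporal filtering layer initialized to reproduce a continuous wavelet transform more concrete, the sketch below builds a bank of complex Morlet kernels that could serve as initial weights of a 1D temporal convolution. It is not the authors' implementation. The choice of the Morlet wavelet, the centre frequencies, the number of cycles, and the kernel length are all assumptions made for illustration; only the 586 Hz sampling frequency comes from the dataset description.

```python
import numpy as np

FS = 586.0  # sampling frequency in Hz, as stated in the dataset description

def morlet_bank(freqs, fs=FS, n_cycles=7.0, kernel_len=293):
    """Return real and imaginary Morlet kernels, each (len(freqs), kernel_len).

    n_cycles and kernel_len are illustrative assumptions.
    """
    t = (np.arange(kernel_len) - kernel_len // 2) / fs  # time axis centred on 0
    freqs = np.asarray(freqs, dtype=float)[:, None]
    sigma_t = n_cycles / (2.0 * np.pi * freqs)          # Gaussian width per frequency
    envelope = np.exp(-t**2 / (2.0 * sigma_t**2))
    wavelet = envelope * np.exp(2j * np.pi * freqs * t)
    wavelet /= np.linalg.norm(wavelet, axis=1, keepdims=True)  # unit energy
    return wavelet.real, wavelet.imag

def band_power(signal, real_k, imag_k):
    """Filter one channel with every kernel and return the squared magnitude."""
    re = np.stack([np.convolve(signal, k, mode="same") for k in real_k])
    im = np.stack([np.convolve(signal, k, mode="same") for k in imag_k])
    return re**2 + im**2
```

With fixed kernels, `band_power` yields hand-crafted time-frequency features. Copying the same kernels into a trainable convolutional layer and updating them by backpropagation would correspond to the end-to-end variant discussed in the text.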
2 Methods
2.1 Dataset
The dataset used in this study was collected as a part of the clinical trial
'BCI and Tetraplegia' (ClinicalTrials.gov identifier: NCT02550522, details in
[1]) approved by the ethical Committee for the Protection of Individuals
(Comité de Protection des Personnes, CPP) with the registration number
15-CHUG-19 and the Agency for the Safety of Medicines and Health Products
(Agence nationale de sécurité du médicament et des produits de santé, ANSM)
with the registration number 2015-A00650-49.
Fig. 2: Screenshot from the virtual environment. The patient is asked to reach
the yellow square (target) with the left hand (effector) using motor imagery.
In the experiment, a 28-year-old tetraplegic patient after spinal cord injury
was asked to move the hands of a virtual avatar displayed on a screen (see
figure 2) using motor imagery patterns, i.e., by imagining/attempting hand
movements that influence brain activity in the motor cortex. These changes
were then recorded with two WIMAGINE [10] implants placed over the primary
motor and sensory cortex bilaterally. Each implant consisted of an 8 × 8 grid
of electrodes; because of limited data transfer, recording was performed using
32 electrodes per implant, selected in a chessboard-like manner, at a sampling
frequency of 586 Hz. Signals from the implants were transferred to the
decoding system that performed online predictions. First, one out of 5
possible states (idle, left and right hand translation, left and right wrist
rotation) was selected with a state decoder. Then, for every state (except
idle), a multilinear REW-NPLS model [3] updated online was used to predict 3D
movements or 1D wrist rotation. The dataset consisted of 44 experimental
sessions recorded over more than 200 days. It comprises 300 and 284 minutes of
recordings for left and right hand translation, respectively.
2.2 Data representation and problem
Based on the collected database, we extracted two datasets, one for left and
one for right hand translation. The raw signal representation was created from
1-second-long windows of ECoG signal with 90% overlap. Every observation
$X_i \in \mathbb{R}^{64 \times 590}$ contained 590 samples for each of the 64
channels corresponding to the number of electrodes recording the signal.
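The windowing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sliding_windows` is hypothetical, and the 59-sample step is an assumption derived from the stated 590-sample window and 90% overlap. It requires NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np

def sliding_windows(recording, win_len=590, overlap=0.9):
    """Cut a (channels, samples) array into overlapping windows.

    Returns an array of shape (n_windows, channels, win_len).
    """
    # Step between window starts: 10% of the window, i.e. 59 samples by default.
    step = max(1, int(round(win_len * (1.0 - overlap))))
    view = np.lib.stride_tricks.sliding_window_view(recording, win_len, axis=1)
    # view has shape (channels, n_positions, win_len); keep every step-th start.
    return np.moveaxis(view[:, ::step, :], 1, 0)
```

For a 64-channel recording with 5900 samples, this gives 91 windows of shape 64 × 590, each shifted by 59 samples relative to the previous one.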
Every signal window $X_i$ was paired with the corresponding desired trajectory
$y_i \in \mathbb{R}^3$ that the patient was asked to follow, i.e., the straight
line connecting the tip of the hand to the target. The trajectories were
computed in the 3D virtual avatar coordinate system mounted in the pelvis of
the effector.