

Deep learning for ECoG brain-computer interface: end-to-end vs. hand-crafted features

Maciej Śliwowski1,2[0000-0001-6744-1714], Matthieu Martin1[0000-0001-5954-8087], Antoine Souloumiac2, Pierre Blanchart2, and Tetiana Aksenova1[0000-0003-4007-2343]

1Univ. Grenoble Alpes, CEA, LETI, Clinatec, F-38000 Grenoble, France

2Université Paris-Saclay, CEA, List, F-91120, Palaiseau, France

Abstract. In brain signal processing, deep learning (DL) models have become commonly used. However, the performance gain from using end-to-end DL models compared to conventional ML approaches is usually significant but moderate, and it typically comes at the cost of increased computational load and deteriorated explainability. The core idea behind deep learning approaches is that performance scales with bigger datasets. However, brain signals are temporal data with a low signal-to-noise ratio, uncertain labels, and nonstationarity over time. Those factors may influence the training process and slow down the improvement of model performance. Their influence may differ between an end-to-end DL model and one using hand-crafted features.

As this has not been studied before, this paper compares models that use the raw ECoG signal with models that use time-frequency features for BCI motor imagery decoding. We investigate whether the current dataset size is a stronger limitation for either type of model. Finally, the obtained filters were compared to identify differences between hand-crafted features and features optimized with backpropagation. To compare the effectiveness of both strategies, we used a multilayer perceptron and a mix of convolutional and LSTM layers that had already proved effective in this task. The analysis was performed on the long-term clinical trial database (almost 600 minutes of recordings) of a tetraplegic patient executing motor imagery tasks for 3D hand translation.

For a given dataset, the results showed that end-to-end training might not be significantly better than the hand-crafted features-based model. The performance gap is reduced with bigger datasets, but considering the increased computational load, end-to-end training may not be profitable for this application.

Keywords: deep learning · ECoG · brain-computer interfaces · dataset size · motor imagery · end-to-end

1 Introduction

In the last decade, deep learning (DL) models achieved extraordinary performance in a variety of complex real-life tasks, e.g., computer vision [4] and natural language processing [2], compared to previously developed models. This was possible mainly thanks to improvements in data processing units and, most importantly, increased dataset sizes [4]. Generally, in brain-computer interface (BCI) research, access to large databases of brain signals is limited due to experimental and medical constraints as well as the large number of paradigm/hardware combinations. Given limited datasets, can we still train end-to-end (E2E) DL models for the medical BCI application as effectively as in computer vision?

arXiv:2210.02544v2 [eess.SP] 12 Oct 2022

In 2019, Roy et al. [12] reported that the number of studies classifying EEG signals with deep learning using hand-crafted features (mainly frequency domain) and raw EEG signals (end-to-end) was similar. This indicates that decoding EEG from raw signals is indeed possible. However, in many articles, researchers decided to use hand-crafted features, which are harder to design. While end-to-end models have come to dominate computer vision, in brain signal processing it is still common to use extracted features as input to DL models. It is unclear whether this is caused by specific signal characteristics (e.g., nonstationarity in time making the creation of a homogeneous dataset impractical, a low signal-to-noise ratio complicating the optimization process and favoring overfitting, or label uncertainty originating from the human-in-the-loop experimental setup) or by researchers' bias toward solutions that are better understood and more explainable.

Most studies do not directly compare DL approaches using end-to-end training with those using hand-crafted features. Usually, DL architectures are compared with each other and with an additional 'traditional' ML pipeline, e.g., filter-bank common spatial pattern (FBCSP) in [15], xDAWN and FBCSP in [5], SVM and FBCSP in [17]. In figure 1, we present the accuracy improvement of the best proposed DL model over the 'traditional' baseline for the articles analyzed in [12] (see footnote 3), depending on the recording time and the number of examples in the dataset. The performance improvement of DL over the 'traditional' baseline increases with the dataset size (except for the last points on the plot, which contain significantly fewer studies). In the right plot, the difference between models using raw EEG and frequency-domain features increases, which may indicate that end-to-end models benefit more from access to bigger datasets than models using hand-crafted features. However, as the proposed DL models are usually compared only to the baseline, this boost of end-to-end models cannot be clearly stated, because the accuracy difference depends strongly on the performance of the 'traditional' baseline model and on the particular task tackled in the study.

While EEG and ECoG signals share many characteristics (both are multichannel temporal signals with information encoded in frequency and space, with low signal-to-noise ratio and noisy labels), there are also differences, e.g., the higher spatial resolution of ECoG, its higher signal-to-noise ratio, and a higher contribution of the informative high gamma band (>70 Hz). In motor imagery ECoG decoding, end-to-end DL is not commonly used. 'Traditional' ML classifiers are usually preceded by a feature extraction step that creates a representation of the brain signals, typically in the form of time-frequency features containing information about the power time course in several frequency bands [8, 14] or focused only on the low-frequency component (LFC)/Local Motor Potential (LMP) [14] (a detailed analysis can be found in [19]).

Fig. 1: Binned average accuracy difference between the best proposed DL model and the 'traditional' baseline on EEG datasets. Error bars denote one standard deviation of the values in the bin. Bins are equal in size on a logarithmic scale. The x-axis position of each point denotes the average dataset size in its bin.

Footnote 3: limited to the articles that contained all the required information; code adapted from [12].
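The binning described in the caption of Fig. 1 (bins equal in size on a logarithmic scale, with the mean and one standard deviation per bin) can be sketched as follows. This is only an illustration of the procedure, not the code adapted from [12]; the function name `log_binned_stats` and the default number of bins are assumptions.

```python
import numpy as np

def log_binned_stats(sizes, diffs, n_bins=5):
    """Per-bin (mean size, mean diff, std diff) for bins equal in size on a log scale."""
    sizes = np.asarray(sizes, dtype=float)
    diffs = np.asarray(diffs, dtype=float)
    # Bin edges equally spaced in log10(dataset size).
    edges = np.logspace(np.log10(sizes.min()), np.log10(sizes.max()), n_bins + 1)
    # Assign each study to a bin; clip so the extreme sizes fall in the first/last bin.
    idx = np.clip(np.digitize(sizes, edges) - 1, 0, n_bins - 1)
    stats = []
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue  # skip empty bins
        # x position = average dataset size in the bin; error bar = one standard deviation.
        stats.append((sizes[mask].mean(), diffs[mask].mean(), diffs[mask].std()))
    return stats
```

Each returned tuple corresponds to one point of such a plot: its x position, its height, and the length of its error bar.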

However, an end-to-end DL model was successfully applied to motor imagery decoding of finger movement trajectories from ECoG, using convolutional layers that filter the raw signal in both the temporal and spatial domains, followed by LSTM layers [20]. Smart weight initialization was helpful in achieving high performance. Nevertheless, the average improvement from training the weights can be estimated as 0.022 ± 0.0393 in Pearson correlation coefficient r, which is relatively small, with a noticeable improvement from end-to-end training in 66% of cases (at the level of subjects/fingers). As this had not been studied before, we investigated the differences in data requirements between an end-to-end model and one using hand-crafted features on a long-term clinical trial BCI dataset of a 3D target reach task. Unique long-term recordings (several months of experiments, more than 600 min duration in total, compared to a few minutes of ECoG recording available in previous studies, e.g., [20]) allowed us to explore the relationship between dataset size and the type of feature used for ECoG signal decoding. In this study, we used architectures previously applied to the ECoG dataset for decoding motor imagery signals with hand-crafted time-frequency features as input [16]. In addition, we optimized the temporal filtering layer with backpropagation, seeking a more efficient set of filters; the filters were initialized to reproduce the continuous wavelet transform. We also investigated whether the two approaches react differently to training dataset perturbations, which may be the case due to distinct model properties and may influence the choice of the optimal data processing pipeline for ECoG BCI.
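To make the idea of a temporal filtering layer "initialized to reproduce continuous wavelet transform" more concrete, the sketch below builds a bank of complex Morlet kernels that could serve as the initial weights of a 1D temporal convolution. It is an illustration under stated assumptions, not the authors' implementation: the center frequencies, the number of cycles `n_cycles`, the kernel length, and the function names are all hypothetical, and the sampling frequency of 586 Hz is the value reported for the recordings.

```python
import numpy as np

FS = 586  # sampling frequency in Hz, as reported for the recordings

def morlet_bank(freqs, fs=FS, n_cycles=7.0, length=None):
    """Complex Morlet kernels, one row per center frequency (illustrative parameters)."""
    freqs = np.asarray(freqs, dtype=float)
    if length is None:
        # Long enough to hold about +/- 3 standard deviations of the widest wavelet.
        sigma_max = n_cycles / (2 * np.pi * freqs.min())
        length = int(2 * np.ceil(3 * sigma_max * fs) + 1)
    t = (np.arange(length) - length // 2) / fs
    sigma = n_cycles / (2 * np.pi * freqs[:, None])  # temporal width per frequency
    kernels = np.exp(2j * np.pi * freqs[:, None] * t) * np.exp(-t**2 / (2 * sigma**2))
    # Normalize each kernel to unit energy so that bands are comparable.
    return kernels / np.linalg.norm(kernels, axis=1, keepdims=True)

def band_power(signal, kernels):
    """Power time course per band: squared magnitude of the filtered signal."""
    return np.stack([np.abs(np.convolve(signal, k, mode="same")) ** 2 for k in kernels])
```

In an end-to-end setting, kernels of this kind would only be the starting point: the layer holding them would then be updated by backpropagation together with the rest of the network, whereas a hand-crafted pipeline keeps them fixed.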

2 Methods

2.1 Dataset

The dataset used in this study was collected as part of the clinical trial 'BCI and Tetraplegia' (ClinicalTrials.gov identifier: NCT02550522, details in [1]), approved by the ethical Committee for the Protection of Individuals (Comité de Protection des Personnes, CPP) with the registration number 15-CHUG-19 and by the Agency for the Safety of Medicines and Health Products (Agence nationale de sécurité du médicament et des produits de santé, ANSM) with the registration number 2015-A00650-49.

Fig. 2: Screenshot from the virtual environment. The patient is asked to reach the yellow square (target) with the left hand (effector) using motor imagery.

In the experiment, a 28-year-old tetraplegic patient after a spinal cord injury was asked to move the hands of a virtual avatar displayed on a screen (see figure 2) using motor imagery patterns, i.e., by imagining/attempting hand movements that influence brain activity in the motor cortex. These changes were then recorded with two WIMAGINE [10] implants placed over the primary motor and sensory cortex bilaterally. Each implant consisted of an 8 × 8 grid of electrodes; due to limited data transfer, recording was performed using 32 electrodes per implant, selected in a chessboard-like manner, with a sampling frequency of 586 Hz. Signals from the implants were transferred to the decoding system that performed online predictions. First, one out of 5 possible states (idle, left and right hand translation, left and right wrist rotation) was selected with a state decoder. Then, for every state (except idle), a multilinear REW-NPLS model [3], updated online, was used to predict 3D movements or 1D wrist rotation. The dataset consisted of 44 experimental sessions recorded over more than 200 days. It comprises 300 and 284 minutes for left and right hand translation, respectively.
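As a small illustration of the chessboard-like electrode selection, the sketch below marks every other electrode of an 8 × 8 grid, which yields 32 electrodes per implant and 64 channels for two implants. Which of the two possible chessboard parities was actually used is not stated here, so the choice in the code is an assumption, as is the function name.

```python
import numpy as np

def chessboard_mask(rows=8, cols=8):
    """Boolean mask selecting every other electrode, like one color of a chessboard."""
    r, c = np.indices((rows, cols))
    # Assumption: squares with an even (row + column) sum are the recorded ones.
    return (r + c) % 2 == 0

mask = chessboard_mask()
selected = np.flatnonzero(mask)   # flat indices of the recorded electrodes on one implant
n_channels = 2 * selected.size    # two implants
```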

2.2 Data representation and problem

Based on the collected database, we extracted two datasets for left and right hand translation. The raw signal representation was created from 1-second-long windows of ECoG signal with 90% overlap. Every observation $X_i \in \mathbb{R}^{64 \times 590}$ contained 590 samples for each of the 64 channels, corresponding to the number of electrodes recording the signal.
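The windowing step can be sketched as below, assuming the continuous recording is available as a NumPy array of shape (channels, time). The window length of 590 samples and the 90% overlap come from the text; the step of 59 samples follows from them, and the function name is hypothetical.

```python
import numpy as np

N_CHANNELS = 64
WINDOW = 590           # samples per window, from the text
STEP = WINDOW // 10    # 90% overlap: advance by 10% of the window (59 samples)

def sliding_windows(recording, window=WINDOW, step=STEP):
    """Cut a (channels, time) recording into (n_windows, channels, window) observations."""
    views = np.lib.stride_tricks.sliding_window_view(recording, window, axis=1)
    # views has shape (channels, time - window + 1, window); keep every `step`-th start.
    return views[:, ::step].transpose(1, 0, 2)
```

With this layout, a recording of T samples yields (T - 590) // 59 + 1 observations, so consecutive windows share 531 of their 590 samples. This strong overlap is worth keeping in mind when splitting the data into training and validation sets.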

Every signal window $X_i$ was paired with the corresponding desired trajectory $y_i \in \mathbb{R}^3$ that the patient was asked to follow, i.e., the straight line connecting the tip of the hand to the target. The trajectories were computed in the 3D virtual avatar coordinate system mounted in the pelvis of the effector.
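A minimal sketch of such a label, assuming the desired trajectory is represented as the vector pointing from the hand tip to the target, with both positions expressed in the avatar coordinate system. Whether the vector is normalized is not stated here, so it is left unnormalized; the function name and the example coordinates are purely illustrative.

```python
import numpy as np

def desired_trajectory(hand_tip, target):
    """Vector from the hand tip to the target in avatar coordinates (assumed unnormalized)."""
    return np.asarray(target, dtype=float) - np.asarray(hand_tip, dtype=float)

# Hypothetical positions, for illustration only.
y_i = desired_trajectory(hand_tip=[0.1, 0.2, 0.0], target=[0.4, 0.2, 0.4])
```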
