Impact of dataset size and long-term ECoG-based BCI usage on deep
learning decoders performance
- URL: http://arxiv.org/abs/2209.03789v1
- Date: Thu, 8 Sep 2022 13:01:05 GMT
- Title: Impact of dataset size and long-term ECoG-based BCI usage on deep
learning decoders performance
- Authors: Maciej Śliwowski, Matthieu Martin, Antoine Souloumiac, Pierre
Blanchart, Tetiana Aksenova
- Abstract summary: In brain-computer interfaces (BCI) research, recording data is time-consuming and expensive.
Can we achieve higher decoding performance with more data to train decoders?
High decoding performance was obtained with relatively small datasets recorded later in the experiment.
- Score: 4.7773230870500605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In brain-computer interfaces (BCI) research, recording data is time-consuming
and expensive, which limits access to big datasets. This may influence the BCI
system performance as machine learning methods depend strongly on the training
dataset size. Important questions arise: taking into account neuronal signal
characteristics (e.g., non-stationarity), can we achieve higher decoding
performance with more data to train decoders? What is the perspective for
further improvement with time in the case of long-term BCI studies? In this
study, we investigated the impact of long-term recordings on motor imagery
decoding from two main perspectives: model requirements regarding dataset size
and potential for patient adaptation. We evaluated a multilinear model and
two deep learning (DL) models on a long-term dataset from the "BCI and
Tetraplegia" clinical trial (NCT02550522), containing 43 sessions of ECoG
recordings from a tetraplegic patient. In the experiment, the participant
executed 3D virtual hand translation using motor imagery patterns. We designed multiple
computational experiments in which training datasets were increased or
translated to investigate the relationship between models' performance and
different factors influencing the recordings. Our analysis showed that adding
more data to the training dataset may not immediately increase performance for
datasets already containing 40 minutes of signal. The DL decoders showed
dataset size requirements similar to those of the multilinear model while
achieving higher decoding performance. Moreover, high decoding performance was
obtained with relatively small datasets recorded later in the experiment,
suggesting improvement of motor imagery patterns and patient adaptation.
Finally, we proposed UMAP embeddings and local intrinsic dimensionality as a
way to visualize the data and potentially evaluate data quality.
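A minimal sketch of that final proposal, assuming scikit-learn-style feature vectors: UMAP for the embedding and a Levina-Bickel maximum-likelihood estimator for local intrinsic dimensionality (LID). The random feature matrix, hyperparameters, and estimator choice are illustrative assumptions, not the authors' pipeline.

```python
# Minimal sketch (not the authors' code): UMAP embedding of per-epoch ECoG
# features plus a Levina-Bickel maximum-likelihood estimate of local
# intrinsic dimensionality (LID). X is a random placeholder for real features.
import numpy as np
import umap  # pip install umap-learn
from sklearn.neighbors import NearestNeighbors

def lid_mle(X, k=20):
    """Per-sample Levina-Bickel LID: -1 / mean(log(d_j / d_k)), j < k."""
    dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    dist = dist[:, 1:]  # drop the zero self-distance
    return -1.0 / np.mean(np.log(dist[:, :-1] / dist[:, -1:]), axis=1)

X = np.random.randn(500, 64)  # placeholder: 500 epochs x 64 features
embedding = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(X)
lid = lid_mle(X)  # e.g., color the 2D embedding by LID to spot odd sessions
```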
Related papers
- Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size.
Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM.
Our results establish the scaling laws of LSM for tasks such as imputation and extrapolation, across both time and sensor modalities.
arXiv Detail & Related papers (2024-10-17T15:08:21Z)
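Scaling-law results of this kind are typically summarized by fitting a saturating power law to performance versus data volume, which also bears on the dataset-size question in the main paper. The sketch below shows one such fit with scipy; the functional form, initial guess, and every data point are illustrative assumptions, not values from either paper.

```python
# Hedged sketch: fit a saturating power law err(N) = a * N**(-b) + c to
# decoding error vs. training-set size. All numbers are made up for
# illustration.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    return a * n ** (-b) + c

minutes = np.array([5.0, 10, 20, 40, 80, 160])         # training data size
err = np.array([0.52, 0.44, 0.38, 0.34, 0.33, 0.325])  # fictitious errors

(a, b, c), _ = curve_fit(power_law, minutes, err, p0=(1.0, 0.5, 0.3))
# Halving the *reducible* error (err - c) needs ~2**(1/b) times more data.
print(f"a={a:.3f}, b={b:.3f}, irreducible error c={c:.3f}")
```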
- Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification [2.5091334993691206]
Development of a robust deep-learning model for retinal disease diagnosis requires a substantial dataset for training.
The capacity to generalize effectively on smaller datasets remains a persistent challenge.
We combined a wide range of data sources to improve performance and generalization to new data.
arXiv Detail & Related papers (2024-09-17T17:22:35Z)
- LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface-form cues to identify data that provides the reasoning skills needed for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z)
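The core selection step can be sketched as cosine similarity between low-rank projections of per-example gradients and a target-task gradient. The random projection and placeholder gradients below are assumptions; LESS itself works with model-derived (LoRA, optimizer-aware) gradients, so see the paper for the actual procedure.

```python
# Hedged sketch in the spirit of LESS: rank training examples by the cosine
# similarity of their (randomly projected, low-rank) gradients to a target
# validation gradient. Gradients here are random placeholders; the real
# method extracts them from a model during training.
import numpy as np

rng = np.random.default_rng(0)
d, k, n_train = 10_000, 256, 1_000
proj = rng.standard_normal((d, k)) / np.sqrt(k)     # random low-rank projection

g_train = rng.standard_normal((n_train, d)) @ proj  # projected train grads
g_target = rng.standard_normal(d) @ proj            # projected target grad

cos = (g_train @ g_target) / (
    np.linalg.norm(g_train, axis=1) * np.linalg.norm(g_target) + 1e-12
)
selected = np.argsort(cos)[::-1][: n_train // 20]   # keep the top 5%
```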
- Improving age prediction: Utilizing LSTM-based dynamic forecasting for data augmentation in multivariate time series analysis [16.91773394335563]
We propose a data augmentation and validation framework that utilizes dynamic forecasting with Long Short-Term Memory (LSTM) networks to enrich datasets.
The effectiveness of these augmented datasets was then compared with that of the original data using various deep learning models designed for chronological age prediction.
arXiv Detail & Related papers (2023-12-11T22:47:26Z)
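A one-step LSTM forecaster rolled out autoregressively is one way to realize this augmentation idea. The PyTorch sketch below is an assumption-laden outline: layer sizes, tensor shapes, and the rollout scheme are illustrative, and the training loop is omitted.

```python
# Hedged sketch of LSTM-based dynamic forecasting for augmentation: train a
# one-step forecaster, then roll it out to append synthetic time points.
import torch
import torch.nn as nn

class Forecaster(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):             # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict the next time step

def extend(model, seq, n_steps):
    """Append n_steps forecast points to seq of shape (1, time, features)."""
    model.eval()
    with torch.no_grad():
        for _ in range(n_steps):
            nxt = model(seq)[:, None, :]      # (1, 1, features)
            seq = torch.cat([seq, nxt], dim=1)
    return seq
```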
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
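The 15 models follow from crossing the three augmentation strategies with the five architectures. A trivial sketch of that experiment grid, with placeholder names rather than the paper's actual strategies and networks:

```python
# Hedged sketch of the 3 x 5 experiment grid behind the "15 models" count.
from itertools import product

augmentations = ["none", "affine", "intensity"]                # 3 assumed strategies
architectures = ["cnn_a", "cnn_b", "cnn_c", "cnn_d", "cnn_e"]  # 5 assumed nets

for aug, arch in product(augmentations, architectures):        # 15 pairs
    print(f"train 3D CNN {arch} with augmentation {aug}")      # one model each
```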
- Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN).
CMMN consists of filtering the signals to adapt their power spectral density (PSD) to a Wasserstein barycenter estimated on the training data.
Numerical experiments on sleep EEG data show that CMMN yields significant and consistent performance gains, independent of the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z)
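For stationary Gaussian signals the Wasserstein barycenter of PSDs has a closed form (the squared mean of the square-root PSDs), and the mapping filter's frequency response is the square root of the target-to-source PSD ratio. The sketch below implements that reading with scipy; the sampling rate, Welch settings, and filter construction are assumptions, so consult the paper for the exact estimator.

```python
# Hedged sketch of the CMMN idea: filter each signal so its PSD matches a
# Wasserstein barycenter of the training PSDs. H = sqrt(psd_bar / psd_i) is
# the frequency response of the mapping filter; parameters are illustrative.
import numpy as np
from scipy.signal import welch, fftconvolve

def cmmn_normalize(signals, fs=250.0, nperseg=256):
    psds = np.stack([welch(s, fs=fs, nperseg=nperseg)[1] for s in signals])
    psd_bar = np.mean(np.sqrt(psds), axis=0) ** 2  # barycenter (Gaussian case)
    out = []
    for s, psd in zip(signals, psds):
        H = np.sqrt(psd_bar / (psd + 1e-12))       # target/source ratio
        h = np.fft.fftshift(np.fft.irfft(H))       # symmetric impulse response
        out.append(fftconvolve(s, h, mode="same"))
    return out
```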
- Deep learning for ECoG brain-computer interface: end-to-end vs. hand-crafted features [4.7773230870500605]
Brain signals are temporal data with a low signal-to-noise ratio, uncertain labels, and nonstationarity over time.
These factors may influence the training process and slow down the models' performance improvement.
This paper compares models that use the raw ECoG signal with models that use time-frequency features for BCI motor imagery decoding.
arXiv Detail & Related papers (2022-10-05T20:18:30Z)
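The hand-crafted side of that comparison typically means log-power time-frequency maps. A minimal version using scipy's spectrogram is sketched below; the sampling rate, window lengths, and band edges are assumptions rather than the paper's exact settings.

```python
# Hedged sketch of hand-crafted time-frequency features for ECoG: per-channel
# log-power spectrograms restricted to an illustrative 4-150 Hz band.
import numpy as np
from scipy.signal import spectrogram

def tf_features(x, fs=500.0):
    """x: (channels, samples) -> (channels, kept_freqs, frames) log-power."""
    f, t, S = spectrogram(x, fs=fs, nperseg=128, noverlap=64, axis=-1)
    band = (f >= 4) & (f <= 150)   # assumed motor-relevant band
    return np.log(S[:, band, :] + 1e-12)

window = np.random.randn(64, 1000)  # placeholder: 64 channels, 2 s at 500 Hz
features = tf_features(window)
```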
- Core-set Selection Using Metrics-based Explanations (CSUME) for multiclass ECG [2.0520503083305073]
We show how selecting good-quality data improves deep learning model performance.
Our experimental results show improvements of 9.67% in precision and 8.69% in recall, with a significant 50% reduction in training data volume.
arXiv Detail & Related papers (2022-05-28T19:36:28Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Evaluating deep transfer learning for whole-brain cognitive decoding [11.898286908882561]
Transfer learning (TL) is well-suited to improve the performance of deep learning (DL) models in datasets with small numbers of samples.
Here, we evaluate TL for the application of DL models to the decoding of cognitive states from whole-brain functional Magnetic Resonance Imaging (fMRI) data.
arXiv Detail & Related papers (2021-11-01T15:44:49Z)
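A common realization of TL in this setting is to freeze a pretrained backbone and fine-tune only a new classification head on the small target dataset. The sketch below assumes that recipe and a placeholder feature dimension; it is not the paper's exact setup.

```python
# Hedged sketch of transfer learning for decoding: reuse a pretrained
# backbone, replace the head, and fine-tune on the small target dataset.
import torch.nn as nn

def make_tl_model(pretrained_backbone, n_classes, freeze=True):
    if freeze:
        for p in pretrained_backbone.parameters():
            p.requires_grad = False  # keep pretrained features fixed
    # Assumed backbone output dimension of 128; adjust to the real model.
    return nn.Sequential(pretrained_backbone, nn.Linear(128, n_classes))
```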
- Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)