Related papers: Advancing EEG-Based Gaze Prediction Using Depthwise Separable Convolution and Enhanced Pre-Processing

URL: http://arxiv.org/abs/2408.03480v1
Date: Tue, 6 Aug 2024 23:43:03 GMT
Title: Advancing EEG-Based Gaze Prediction Using Depthwise Separable Convolution and Enhanced Pre-Processing
Authors: Matthew L Key, Tural Mehtiyev, Xiaodong Qu
Abstract summary: We introduce a novel method, the EEG Deeper Clustered Vision Transformer (EEG-DCViT), which combines depthwise separable convolutional neural networks (CNNs) with vision transformers. The new approach demonstrates superior performance, establishing a new benchmark with a Root Mean Square Error (RMSE) of 51.6 mm.
Score: 0.8192907805418583
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In the field of EEG-based gaze prediction, the application of deep learning to interpret complex neural data poses significant challenges. This study evaluates the effectiveness of pre-processing techniques and the effect of additional depthwise separable convolution on EEG vision transformers (ViTs) in a pretrained model architecture. We introduce a novel method, the EEG Deeper Clustered Vision Transformer (EEG-DCViT), which combines depthwise separable convolutional neural networks (CNNs) with vision transformers, enriched by a pre-processing strategy involving data clustering. The new approach demonstrates superior performance, establishing a new benchmark with a Root Mean Square Error (RMSE) of 51.6 mm. This achievement underscores the impact of pre-processing and model refinement in enhancing EEG-based applications.
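As a rough illustration of the depthwise separable convolution named in the abstract, the sketch below applies one filter per channel (the depthwise step) and then mixes channels with a 1x1 convolution (the pointwise step). It is a minimal pure-Python sketch, not the EEG-DCViT implementation: the function names, the 1-D layout, the toy signal, and the kernel sizes are all assumptions made for illustration.

```python
# Illustrative sketch only: a 1-D depthwise separable convolution in pure Python.
# Function names, shapes, and values are hypothetical, not taken from the paper.

def depthwise_conv1d(x, kernels):
    """Apply one kernel per channel ('valid' padding, stride 1).

    x: list of channels, each a list of samples.
    kernels: one kernel (list of weights) per channel.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        out.append([
            sum(channel[t + i] * kernel[i] for i in range(k))
            for t in range(len(channel) - k + 1)
        ])
    return out


def pointwise_conv1d(x, weights):
    """Mix channels with a 1x1 convolution.

    weights[o][c] is the weight from input channel c to output channel o.
    """
    length = len(x[0])
    return [
        [sum(w[c] * x[c][t] for c in range(len(x))) for t in range(length)]
        for w in weights
    ]


def depthwise_separable_conv1d(x, kernels, weights):
    """Depthwise step followed by the pointwise step."""
    return pointwise_conv1d(depthwise_conv1d(x, kernels), weights)


def param_counts(c_in, c_out, k):
    """Weight counts (biases ignored) for a standard vs. separable 1-D conv."""
    standard = c_in * c_out * k
    separable = c_in * k + c_in * c_out
    return standard, separable


# Two toy channels with four samples each.
signal = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
kernels = [[1.0, 1.0], [1.0, -1.0]]   # one length-2 kernel per channel
weights = [[1.0, 1.0]]                # one output channel summing both inputs
result = depthwise_separable_conv1d(signal, kernels, weights)
counts = param_counts(64, 128, 3)
```

The usual motivation for this factorization is the smaller weight count: for `c_in` input channels, `c_out` output channels, and kernel length `k`, a standard convolution uses `c_in * c_out * k` weights, while the separable version uses `c_in * k + c_in * c_out`.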

Related papers

BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNNs). We propose BHViT, a binarization-friendly hybrid ViT architecture and its full binarization model with the guidance of three important observations. Our proposed algorithm achieves SOTA performance among binary ViT methods.
arXiv Detail & Related papers (2025-03-04T08:35:01Z)
Causal Transformer for Fusion and Pose Estimation in Deep Visual Inertial Odometry [1.2289361708127877]
We propose a causal visual-inertial fusion transformer (VIFT) for pose estimation in deep visual-inertial odometry. The proposed method is end-to-end trainable and requires only a monocular camera and IMU during inference.
arXiv Detail & Related papers (2024-09-13T12:21:25Z)
How Homogenizing the Channel-wise Magnitude Can Enhance EEG Classification Model? [4.0871083166108395]
We propose a simple yet effective approach for EEG data pre-processing. Our method first transforms the EEG data into an encoded image by an Inverted Channel-wise Magnitude Homogenization. By doing so, we can improve the EEG learning process efficiently without using a huge Deep Learning network.
arXiv Detail & Related papers (2024-07-19T09:11:56Z)
Fusing Pretrained ViTs with TCNet for Enhanced EEG Regression [0.07999703756441758]
This paper details the integration of pre-trained Vision Transformers (ViTs) with Temporal Convolutional Networks (TCNet) to enhance the precision of EEG regression. Our results showcase a substantial improvement in regression accuracy, as evidenced by the reduction of Root Mean Square Error (RMSE) from 55.4 to 51.8. Without sacrificing performance, we increase the speed of this model by an order of magnitude (up to 4.32x faster).
arXiv Detail & Related papers (2024-04-02T17:01:51Z)
Leveraging the Power of Data Augmentation for Transformer-based Tracking [64.46371987827312]
We propose two data augmentation methods customized for tracking. First, we optimize existing random cropping via a dynamic search radius mechanism and simulation for boundary samples. Second, we propose a token-level feature mixing augmentation strategy, which strengthens the model against challenges like background interference.
arXiv Detail & Related papers (2023-09-15T09:18:54Z)
DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection [49.196182908826565]
Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment. Current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data like images. This paper proposes a dynamical graph self-distillation (DGSD) approach for AAD, which does not require speech stimuli as input.
arXiv Detail & Related papers (2023-09-07T13:43:46Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Data augmentation for learning predictive models on EEG: a systematic comparison [79.84079335042456]
Deep learning for electroencephalography (EEG) classification tasks has been growing rapidly in recent years. Deep learning for EEG classification tasks has been limited by the relatively small size of EEG datasets. Data augmentation has been a key ingredient to obtain state-of-the-art performances across applications such as computer vision or speech.
arXiv Detail & Related papers (2022-06-29T09:18:15Z)
Benchmarking Detection Transfer Learning with Vision Transformers [60.97703494764904]
The complexity of object detection methods can make benchmarking non-trivial when new architectures, such as Vision Transformer (ViT) models, arrive. We present training techniques that overcome these challenges, enabling the use of standard ViT models as the backbone of Mask R-CNN. Our results show that recent masking-based unsupervised learning methods may, for the first time, provide convincing transfer learning improvements on COCO.
arXiv Detail & Related papers (2021-11-22T18:59:15Z)
GANSER: A Self-supervised Data Augmentation Framework for EEG-based Emotion Recognition [15.812231441367022]
We propose a novel data augmentation framework, namely Generative Adversarial Network-based Self-supervised Data Augmentation (GANSER). As the first to combine adversarial training with self-supervised learning for EEG-based emotion recognition, the proposed framework can generate high-quality simulated EEG samples. A transformation function is employed to mask parts of EEG signals and force the generator to synthesize potential EEG signals based on the remaining parts.
arXiv Detail & Related papers (2021-09-07T14:42:55Z)
ScalingNet: extracting features from raw EEG data for emotion recognition [4.047737925426405]
We propose a novel convolutional layer that adaptively extracts effective data-driven spectrogram-like features from raw EEG signals. The proposed neural network architecture based on the scaling layer, referred to as ScalingNet, has achieved the state-of-the-art result across the established DEAP benchmark dataset.
arXiv Detail & Related papers (2021-02-07T08:54:27Z)
EEG-Inception: An Accurate and Robust End-to-End Neural Network for EEG-based Motor Imagery Classification [123.93460670568554]
This paper proposes a novel convolutional neural network (CNN) architecture for accurate and robust EEG-based motor imagery (MI) classification. The proposed CNN model, namely EEG-Inception, is built on the backbone of the Inception-Time network. The proposed network is an end-to-end classifier, as it takes the raw EEG signals as the input and does not require complex EEG signal pre-processing.
arXiv Detail & Related papers (2021-01-24T19:03:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.