MVCNet: Multi-View Contrastive Network for Motor Imagery Classification
- URL: http://arxiv.org/abs/2502.17482v3
- Date: Sun, 27 Apr 2025 19:58:19 GMT
- Title: MVCNet: Multi-View Contrastive Network for Motor Imagery Classification
- Authors: Ziwei Wang, Siyang Li, Xiaoqing Chen, Wei Li, Dongrui Wu
- Abstract summary: Motor imagery (MI) decoding has received significant attention due to its intuitive mechanism. Most existing models rely on single-stream architectures and overlook the multi-view nature of EEG signals, leading to limited performance and generalization. We propose a multi-view contrastive network (MVCNet), a dual-branch architecture that integrates CNN and Transformer models in parallel to capture both local spatial-temporal features and global temporal dependencies.
- Score: 20.78236894605647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Electroencephalography (EEG)-based brain-computer interfaces (BCIs) enable neural interaction by decoding brain activity for external communication. Motor imagery (MI) decoding has received significant attention due to its intuitive mechanism. However, most existing models rely on single-stream architectures and overlook the multi-view nature of EEG signals, leading to limited performance and generalization. We propose a multi-view contrastive network (MVCNet), a dual-branch architecture that integrates CNN and Transformer models in parallel to capture both local spatial-temporal features and global temporal dependencies. To enhance the informativeness of training data, MVCNet incorporates a unified augmentation pipeline across time, frequency, and spatial domains. Two contrastive modules are further introduced: a cross-view contrastive module that enforces consistency between original and augmented views, and a cross-model contrastive module that aligns features extracted from both branches. Final representations are fused and jointly optimized by contrastive and classification losses. Experiments on five public MI datasets across three scenarios demonstrate that MVCNet consistently outperforms seven state-of-the-art MI decoding networks, highlighting its effectiveness and generalization ability. MVCNet provides a robust solution for MI decoding by integrating multi-view information and dual-branch modeling, contributing to the development of more reliable BCI systems.
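The abstract outlines a concrete architecture, so a minimal sketch may help make the pieces explicit. The PyTorch sketch below illustrates the described design: a CNN branch for local spatial-temporal features, a Transformer branch for global temporal dependencies, an NT-Xent-style contrastive loss applied across views (original vs. augmented) and across models (CNN vs. Transformer), and a classification head over the fused features. All module names, layer sizes, and the Gaussian-noise stand-in for the time/frequency/spatial augmentation pipeline are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the dual-branch contrastive design described in the
# abstract. Assumes PyTorch; all names and hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNBranch(nn.Module):
    """Local spatial-temporal features from EEG trials shaped (B, 1, C, T)."""
    def __init__(self, n_channels=22, d_model=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, d_model, (1, 25), padding=(0, 12)),   # temporal convolution
            nn.Conv2d(d_model, d_model, (n_channels, 1)),      # spatial convolution
            nn.BatchNorm2d(d_model), nn.ELU(),
            nn.AdaptiveAvgPool2d((1, 1)),
        )

    def forward(self, x):
        return self.net(x).flatten(1)                          # (B, d_model)

class TransformerBranch(nn.Module):
    """Global temporal dependencies via self-attention over time steps."""
    def __init__(self, n_channels=22, d_model=64):
        super().__init__()
        self.embed = nn.Linear(n_channels, d_model)            # one token per time step
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):
        tokens = self.embed(x.squeeze(1).transpose(1, 2))      # (B, T, d_model)
        return self.encoder(tokens).mean(dim=1)                # (B, d_model)

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent contrastive loss: matched rows of z1 and z2 are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                                 # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)       # positives on diagonal
    return F.cross_entropy(logits, targets)

class MVCNetSketch(nn.Module):
    def __init__(self, n_channels=22, d_model=64, n_classes=4):
        super().__init__()
        self.cnn = CNNBranch(n_channels, d_model)
        self.trf = TransformerBranch(n_channels, d_model)
        self.head = nn.Linear(2 * d_model, n_classes)

    def forward(self, x, x_aug):
        f_cnn, f_trf = self.cnn(x), self.trf(x)                # original view
        g_cnn, g_trf = self.cnn(x_aug), self.trf(x_aug)        # augmented view
        loss_view = nt_xent(f_cnn, g_cnn) + nt_xent(f_trf, g_trf)  # cross-view consistency
        loss_model = nt_xent(f_cnn, f_trf)                     # cross-model alignment
        logits = self.head(torch.cat([f_cnn, f_trf], dim=1))   # fused representation
        return logits, loss_view + loss_model

# Joint optimization of classification and contrastive objectives.
model = MVCNetSketch()
x = torch.randn(8, 1, 22, 512)                 # a batch of EEG trials
x_aug = x + 0.1 * torch.randn_like(x)          # placeholder for the augmentation pipeline
labels = torch.randint(0, 4, (8,))
logits, loss_con = model(x, x_aug)
loss = F.cross_entropy(logits, labels) + 0.5 * loss_con       # 0.5 is an arbitrary weight
loss.backward()
```

The abstract states only that contrastive and classification losses are jointly optimized; the relative weighting of the two terms (0.5 above) is a placeholder hyperparameter, not a value from the paper.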
Related papers
- An Efficient and Mixed Heterogeneous Model for Image Restoration [71.85124734060665]
Current mainstream approaches are based on three architectural paradigms: CNNs, Transformers, and Mambas.
We propose RestorMixer, an efficient and general-purpose IR model based on mixed-architecture fusion.
arXiv Detail & Related papers (2025-04-15T08:19:12Z) - BIMII-Net: Brain-Inspired Multi-Iterative Interactive Network for RGB-T Road Scene Semantic Segmentation [6.223341988991549]
We propose a novel RGB-T road scene semantic segmentation network called Brain-Inspired Multi-Iterative Interactive Network (BIMII-Net).
First, to meet the requirements of accurate texture and local information extraction in road scenarios such as autonomous driving, we propose a deep continuous-coupled neural network (DCCNN) architecture based on a brain-inspired model.
Second, to enhance the interaction and expression capabilities among multi-modal information, we design a cross explicit attention-enhanced fusion module (CEAEF-Module) in the feature fusion stage of BIMII-Net.
Finally, we construct a complementary interactive multi-layer decoder.
arXiv Detail & Related papers (2025-03-25T03:09:46Z) - Multimodal-Aware Fusion Network for Referring Remote Sensing Image Segmentation [7.992331117310217]
Referring remote sensing image segmentation (RRSIS) is a novel visual task that segments objects in remote sensing images according to a given referring expression.
We design a multimodal-aware fusion network (MAFN) to achieve fine-grained alignment and fusion between the two modalities.
arXiv Detail & Related papers (2025-03-14T08:31:21Z) - Optimized Unet with Attention Mechanism for Multi-Scale Semantic Segmentation [8.443350618722564]
This paper proposes an improved Unet model combined with an attention mechanism.
It introduces channel attention and spatial attention modules, enhancing the model's ability to focus on important features.
The improved model performs well in terms of mIoU and pixel accuracy (PA), reaching 76.5% and 95.3% respectively.
arXiv Detail & Related papers (2025-02-06T06:51:23Z) - CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information [61.1904164368732]
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains Modality Experts for each modality to extract cross-modal information from the EEG modality. The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities.
arXiv Detail & Related papers (2024-12-13T16:27:54Z) - Online Multi-modal Root Cause Analysis [61.94987309148539]
Root Cause Analysis (RCA) is essential for pinpointing the root causes of failures in microservice systems.
Existing online RCA methods handle only single-modal data, overlooking complex interactions in multi-modal systems.
We introduce OCEAN, a novel online multi-modal causal structure learning method for root cause localization.
arXiv Detail & Related papers (2024-10-13T21:47:36Z) - INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model [71.50973774576431]
We propose a novel MLLM, INF-LLaVA, designed for effective high-resolution image perception.
First, we introduce a Dual-perspective Cropping Module (DCM), which ensures that each sub-image contains continuous details from a local perspective.
Second, we introduce a Dual-perspective Enhancement Module (DEM) to enable the mutual enhancement of global and local features.
arXiv Detail & Related papers (2024-07-23T06:02:30Z) - A Knowledge-Driven Cross-view Contrastive Learning for EEG Representation [48.85731427874065]
This paper proposes a knowledge-driven cross-view contrastive learning framework (KDC2) to extract effective representations from EEG with limited labels.
The KDC2 method creates scalp and neural views of EEG signals, simulating the internal and external representation of brain activity.
By modeling prior neural knowledge based on neural information consistency theory, the proposed method extracts invariant and complementary neural knowledge to generate combined representations.
arXiv Detail & Related papers (2023-09-21T08:53:51Z) - Epistemic Graph: A Plug-And-Play Module For Hybrid Representation Learning [46.48026220464475]
Humans exhibit hybrid learning, seamlessly integrating structured knowledge for cross-domain recognition or relying on a smaller amount of data samples for few-shot learning.
We introduce a novel Epistemic Graph Layer (EGLayer) to enable hybrid learning, enhancing the exchange of information between deep features and a structured knowledge graph.
arXiv Detail & Related papers (2023-05-30T04:10:15Z) - LMDA-Net: A lightweight multi-dimensional attention network for general EEG-based brain-computer interface paradigms and interpretability [2.3945862743903916]
We propose a novel lightweight multi-dimensional attention network, called LMDA-Net.
By incorporating two novel attention modules designed specifically for EEG signals, LMDA-Net can effectively integrate features from multiple dimensions.
LMDA-Net outperforms other representative methods in terms of classification accuracy and prediction volatility.
arXiv Detail & Related papers (2023-03-29T02:35:02Z) - DoubleU-NetPlus: A Novel Attention and Context Guided Dual U-Net with Multi-Scale Residual Feature Fusion Network for Semantic Segmentation of Medical Images [2.20200533591633]
We present a novel dual U-Net-based architecture named DoubleU-NetPlus.
We exploit multi-contextual features and several attention strategies to increase the network's ability to model discriminative feature representations.
To mitigate the gradient vanishing issue and incorporate high-resolution features with deeper spatial details, the standard convolution operation is replaced with the attention-guided residual convolution operations.
arXiv Detail & Related papers (2022-11-25T16:56:26Z) - EEG-ITNet: An Explainable Inception Temporal Convolutional Network for Motor Imagery Classification [0.5616884466478884]
We propose an end-to-end deep learning architecture called EEG-ITNet.
Our model can extract rich spectral, spatial, and temporal information from multi-channel EEG signals.
EEG-ITNet shows up to 5.9% improvement in the classification accuracy in different scenarios.
arXiv Detail & Related papers (2022-04-14T13:18:43Z) - Tensor-CSPNet: A Novel Geometric Deep Learning Framework for Motor Imagery Classification [14.95694356964053]
We propose a geometric deep learning framework called Tensor-CSPNet to characterize EEG signals on symmetric positive definite (SPD) manifolds.
Tensor-CSPNet attains or slightly outperforms the current state-of-the-art performance on the cross-validation and holdout scenarios of two MI-EEG datasets.
arXiv Detail & Related papers (2022-02-05T02:52:23Z) - Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale.
Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z) - Full-Duplex Strategy for Video Object Segmentation [141.43983376262815]
The Full-Duplex Strategy Network (FSNet) is a novel framework for video object segmentation (VOS).
Our FSNet performs cross-modal feature passing (i.e., transmission and receiving) simultaneously before the fusion and decoding stage.
We show that our FSNet outperforms other state-of-the-art methods on both the VOS and video salient object detection tasks.
arXiv Detail & Related papers (2021-08-06T14:50:50Z) - CNN-based Approaches For Cross-Subject Classification in Motor Imagery: From The State-of-The-Art to DynamicNet [0.2936007114555107]
Motor imagery (MI)-based brain-computer interface (BCI) systems are being increasingly employed to provide alternative means of communication and control.
Accurately classifying MI from brain signals is essential to obtain reliable BCI systems.
Deep learning approaches have started to emerge as valid alternatives to standard machine learning techniques.
arXiv Detail & Related papers (2021-05-17T14:57:13Z) - Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation [87.01669173673288]
We propose an encoder fusion network (EFN), which transforms the visual encoder into a multi-modal feature learning network.
A co-attention mechanism is embedded in the EFN to realize the parallel update of multi-modal features.
The experimental results on four benchmark datasets demonstrate that the proposed approach achieves state-of-the-art performance without any post-processing.
arXiv Detail & Related papers (2021-05-05T02:27:25Z) - MVFNet: Multi-View Fusion Network for Efficient Video Recognition [79.92736306354576]
We introduce a multi-view fusion (MVF) module to exploit video complexity using separable convolution for efficiency.
MVFNet can be thought of as a generalized video modeling framework.
arXiv Detail & Related papers (2020-12-13T06:34:18Z) - Visual Concept Reasoning Networks [93.99840807973546]
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.
We propose to exploit this strategy and combine it with our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts.
Our proposed model, VCRNet, consistently improves performance while increasing the number of parameters by less than 1%.
arXiv Detail & Related papers (2020-08-26T20:02:40Z) - Few-Shot Relation Learning with Attention for EEG-based Motor Imagery Classification [11.873435088539459]
Brain-computer interfaces (BCIs) based on electroencephalography (EEG) signals have received considerable attention.
Motor imagery (MI) data can be used to aid rehabilitation as well as in autonomous driving scenarios.
Accurate classification of MI signals is vital for EEG-based BCI systems.
arXiv Detail & Related papers (2020-03-03T02:34:44Z)