Visual Neural Decoding via Improved Visual-EEG Semantic Consistency
- URL: http://arxiv.org/abs/2408.06788v1
- Date: Tue, 13 Aug 2024 10:16:10 GMT
- Title: Visual Neural Decoding via Improved Visual-EEG Semantic Consistency
- Authors: Hongzhou Chen, Lianghua He, Yihang Liu, Longzhen Yang, et al.
- Abstract summary: Methods that directly map EEG features to the CLIP embedding space may introduce mapping bias and cause semantic inconsistency.
We propose a Visual-EEG Semantic Decouple Framework that explicitly extracts the semantic-related features of these two modalities to facilitate optimal alignment.
Our method achieves state-of-the-art results in zero-shot neural decoding tasks.
- Score: 3.4061238650474657
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual neural decoding refers to the process of extracting and interpreting original visual experiences from human brain activity. Recent advances in metric learning-based EEG visual decoding methods have delivered promising results and demonstrated the feasibility of decoding novel visual categories from brain activity. However, methods that directly map EEG features to the CLIP embedding space may introduce mapping bias and cause semantic inconsistency among features, thereby degrading alignment and impairing decoding performance. To further explore the semantic consistency between visual and neural signals, we construct a joint semantic space and propose a Visual-EEG Semantic Decouple Framework that explicitly extracts the semantic-related features of the two modalities to facilitate optimal alignment. Specifically, a cross-modal information decoupling module is introduced to guide the extraction of semantic-related information from each modality. Then, by quantifying the mutual information between visual image and EEG features, we observe a strong positive correlation between decoding performance and the magnitude of mutual information. Furthermore, inspired by neuroscientific accounts of visual object understanding, we propose an intra-class geometric consistency approach for the alignment process. This strategy maps visual samples within the same class to consistent neural patterns, further enhancing the robustness and performance of EEG visual decoding. Experiments on a large Image-EEG dataset show that our method achieves state-of-the-art results in zero-shot neural decoding tasks.
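The listing carries no code, so the following is a minimal sketch of the two quantitative ideas in the abstract: estimating the visual-EEG mutual information through the InfoNCE lower bound, and an intra-class geometric consistency penalty that pulls same-class visual samples toward a common pattern. Function names, tensor shapes, and the centroid-based form of the consistency term are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def infonce_mi_bound(eeg_z: torch.Tensor, img_z: torch.Tensor, tau: float = 0.07):
    """InfoNCE alignment loss over a batch of paired (EEG, image) embeddings.
    By van den Oord et al. (2018), MI(eeg; img) >= log(N) - loss, so a lower
    loss certifies a larger mutual-information lower bound."""
    eeg_z = F.normalize(eeg_z, dim=-1)
    img_z = F.normalize(img_z, dim=-1)
    logits = eeg_z @ img_z.t() / tau                      # (N, N) similarities
    targets = torch.arange(eeg_z.size(0), device=eeg_z.device)
    loss = F.cross_entropy(logits, targets)
    mi_lower_bound = torch.log(torch.tensor(float(eeg_z.size(0)))) - loss
    return loss, mi_lower_bound

def intra_class_consistency(img_z: torch.Tensor, labels: torch.Tensor):
    """Hypothetical intra-class geometric consistency term: penalize the
    spread of same-class visual embeddings around their class centroid, so
    images of one category map to a consistent neural pattern."""
    loss = img_z.new_zeros(())
    classes = labels.unique()
    for c in classes:
        members = img_z[labels == c]
        centroid = members.mean(dim=0, keepdim=True)
        loss = loss + (members - centroid).pow(2).sum(dim=-1).mean()
    return loss / len(classes)
```

Tracking `mi_lower_bound` across training runs would reproduce the kind of mutual-information-versus-accuracy comparison the abstract describes.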
Related papers
- Graph Contrastive Learning for Connectome Classification [7.444875183336163]
Graph signal processing is a key tool in unraveling the interplay between the brain's function and structure.
Our work represents a further step in this direction by exploring supervised contrastive learning methods.
The proposed framework achieves state-of-the-art performance on a gender classification task using Human Connectome Project data.
arXiv Detail & Related papers (2025-02-07T17:30:47Z)
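For readers unfamiliar with the objective the connectome paper above builds on, here is a minimal sketch of a supervised contrastive loss in the style of Khosla et al. (2020); the tensor names and temperature are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def sup_con_loss(embeddings: torch.Tensor, labels: torch.Tensor, tau: float = 0.1):
    """Supervised contrastive loss: connectome embeddings sharing a label are
    treated as positives and pulled together; all others are pushed apart."""
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.t() / tau
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))       # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0                                # anchors with at least one positive
    mean_pos = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)[valid] / pos_counts[valid]
    return -mean_pos.mean()
```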
- Neural-MCRL: Neural Multimodal Contrastive Representation Learning for EEG-based Visual Decoding [2.587640069216139]
Decoding neural visual representations from electroencephalogram (EEG)-based brain activity is crucial for advancing brain-machine interfaces (BMI).
Existing methods often overlook semantic consistency and completeness within modalities and lack effective semantic alignment across modalities.
We propose Neural-MCRL, a novel framework that achieves multimodal alignment through semantic bridging and cross-attention mechanisms.
arXiv Detail & Related papers (2024-12-23T07:02:44Z)
- CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information [61.1904164368732]
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals.
Specifically, CognitionCapturer trains Modality Experts for each modality to extract cross-modal information from the EEG modality.
The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities.
arXiv Detail & Related papers (2024-12-13T16:27:54Z)
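A rough sketch of CognitionCapturer's modality-expert idea as the summary above describes it: a small trainable head per modality projects frozen backbone features into a shared space for matching against EEG embeddings. The dimensions and the two example modalities are assumptions.

```python
import torch.nn as nn

class ModalityExpert(nn.Module):
    """Lightweight per-modality projection head; the backbone features it
    consumes (e.g., from a frozen image or text encoder) stay fixed."""
    def __init__(self, in_dim: int, shared_dim: int = 512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(in_dim, shared_dim),
            nn.GELU(),
            nn.Linear(shared_dim, shared_dim),
        )

    def forward(self, feats):
        return self.proj(feats)

# One expert per modality; only these (plus the EEG encoder) would be trained,
# consistent with the summary's claim that the generative models need no fine-tuning.
experts = nn.ModuleDict({
    "image": ModalityExpert(in_dim=768),
    "text": ModalityExpert(in_dim=512),
})
```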
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
- Learning Robust Deep Visual Representations from EEG Brain Recordings [13.768240137063428]
This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep representations.
We demonstrate the generalizability of our feature extraction pipeline across three different datasets using deep-learning architectures.
We propose a novel framework to transform unseen images into the EEG space and reconstruct them approximately.
arXiv Detail & Related papers (2023-10-25T10:26:07Z)
- A Knowledge-Driven Cross-view Contrastive Learning for EEG Representation [48.85731427874065]
This paper proposes a knowledge-driven cross-view contrastive learning framework (KDC2) to extract effective representations from EEG with limited labels.
The KDC2 method creates scalp and neural views of EEG signals, simulating the internal and external representation of brain activity.
By modeling prior neural knowledge based on neural information consistency theory, the proposed method extracts invariant and complementary neural knowledge to generate combined representations.
arXiv Detail & Related papers (2023-09-21T08:53:51Z)
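To make KDC2's two-view construction concrete, here is a schematic sketch in which a "scalp" branch encodes the channel-space signal and a "neural" branch encodes a transformed (e.g., source-space) view before fusion; the layer choices and dimensions are illustrative guesses, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossViewEEGEncoder(nn.Module):
    """Encodes two complementary views of the same EEG trial and fuses them
    into one representation, mirroring the internal/external-view framing."""
    def __init__(self, n_channels: int, n_sources: int, hidden: int = 256):
        super().__init__()
        self.scalp_enc = nn.GRU(n_channels, hidden, batch_first=True)
        self.neural_enc = nn.GRU(n_sources, hidden, batch_first=True)

    def forward(self, scalp_x, neural_x):
        # inputs: (batch, time, n_channels) and (batch, time, n_sources)
        _, h_scalp = self.scalp_enc(scalp_x)     # external (scalp) view
        _, h_neural = self.neural_enc(neural_x)  # internal (neural) view
        return torch.cat([h_scalp[-1], h_neural[-1]], dim=-1)
```

A contrastive loss between the two branch embeddings would then isolate the view-invariant component the summary refers to.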
- Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z)
- See What You See: Self-supervised Cross-modal Retrieval of Visual Stimuli from Brain Activity [37.837710340954374]
We present a single-stage EEG-visual retrieval paradigm where data of two modalities are correlated, as opposed to their annotations.
We demonstrate that the proposed approach completes an instance-level EEG-visual retrieval task that existing methods cannot.
arXiv Detail & Related papers (2022-08-07T08:11:15Z)
- Cross-modal Representation Learning for Zero-shot Action Recognition [67.57406812235767]
We present a cross-modal Transformer-based framework that jointly encodes video data and text labels for zero-shot action recognition (ZSAR).
Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner.
Experimental results show our model considerably improves upon the state of the art in ZSAR, reaching encouraging top-1 accuracy on the UCF101, HMDB51, and ActivityNet benchmark datasets.
arXiv Detail & Related papers (2022-05-03T17:39:27Z)
- CogAlign: Learning to Align Textual Neural Representations to Cognitive Language Processing Signals [60.921888445317705]
We propose a CogAlign approach to integrate cognitive language processing signals into natural language processing models.
We show that CogAlign achieves significant improvements with multiple cognitive features over state-of-the-art models on public datasets.
arXiv Detail & Related papers (2021-06-10T07:10:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.