Visual Neural Decoding via Improved Visual-EEG Semantic Consistency
- URL: http://arxiv.org/abs/2408.06788v1
- Date: Tue, 13 Aug 2024 10:16:10 GMT
- Title: Visual Neural Decoding via Improved Visual-EEG Semantic Consistency
- Authors: Hongzhou Chen, Lianghua He, Yihang Liu, Longzhen Yang,
- Abstract summary: Methods that directly map EEG features to the CLIP embedding space may introduce mapping bias and cause semantic inconsistency.
We propose a Visual-EEG Semantic Decouple Framework that explicitly extracts the semantic-related features of these two modalities to facilitate optimal alignment.
Our method achieves state-of-the-art results in zero-shot neural decoding tasks.
- Score: 3.4061238650474657
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual neural decoding refers to the process of extracting and interpreting original visual experiences from human brain activity. Recent advances in metric learning-based EEG visual decoding methods have delivered promising results and demonstrated the feasibility of decoding novel visual categories from brain activity. However, methods that directly map EEG features to the CLIP embedding space may introduce mapping bias and cause semantic inconsistency among features, thereby degrading alignment and impairing decoding performance. To further explore the semantic consistency between visual and neural signals, in this work we construct a joint semantic space and propose a Visual-EEG Semantic Decouple Framework that explicitly extracts the semantic-related features of the two modalities to facilitate optimal alignment. Specifically, a cross-modal information decoupling module is introduced to guide the extraction of semantic-related information from each modality. Then, by quantifying the mutual information between visual image and EEG features, we observe a strong positive correlation between decoding performance and the magnitude of mutual information. Furthermore, inspired by the mechanisms of visual object understanding from neuroscience, we propose an intra-class geometric consistency approach for the alignment process. This strategy maps visual samples within the same class to consistent neural patterns, further enhancing the robustness and performance of EEG visual decoding. Experiments on a large Image-EEG dataset show that our method achieves state-of-the-art results in zero-shot neural decoding tasks.
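The abstract describes the recipe but not the implementation, so the following is a minimal, hypothetical PyTorch sketch of the general idea: project EEG features and frozen CLIP image embeddings into a joint semantic space, align them with a symmetric InfoNCE loss (whose optimum lower-bounds the EEG-image mutual information), and add an intra-class geometric consistency term that pulls images of the same category toward a shared pattern. All module names, feature dimensions, and loss weights here are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch only; names, dimensions, and weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DecoupledAligner(nn.Module):
    """Projects EEG and image features into a joint semantic space."""

    def __init__(self, eeg_dim=1700, img_dim=512, joint_dim=256):
        super().__init__()
        # Semantic branches keep only alignment-relevant information.
        self.eeg_semantic = nn.Sequential(nn.Linear(eeg_dim, joint_dim), nn.GELU(),
                                          nn.Linear(joint_dim, joint_dim))
        self.img_semantic = nn.Sequential(nn.Linear(img_dim, joint_dim), nn.GELU(),
                                          nn.Linear(joint_dim, joint_dim))
        self.logit_scale = nn.Parameter(torch.tensor(2.0))

    def forward(self, eeg_feats, img_feats):
        z_eeg = F.normalize(self.eeg_semantic(eeg_feats), dim=-1)
        z_img = F.normalize(self.img_semantic(img_feats), dim=-1)
        return z_eeg, z_img


def info_nce(z_eeg, z_img, logit_scale):
    """Symmetric InfoNCE; minimizing it tightens a lower bound on EEG-image mutual information."""
    logits = logit_scale.exp() * z_eeg @ z_img.t()
    targets = torch.arange(z_eeg.size(0), device=z_eeg.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


def intra_class_consistency(z_img, labels):
    """Pull same-class image embeddings toward their class centroid, so images of one
    category map to a consistent pattern in the neural-aligned space."""
    loss, count = z_img.new_zeros(()), 0
    for c in labels.unique():
        members = z_img[labels == c]
        if members.size(0) > 1:
            centroid = members.mean(dim=0, keepdim=True)
            loss = loss + (1.0 - F.cosine_similarity(members, centroid)).mean()
            count += 1
    return loss / max(count, 1)


# Toy usage with random tensors standing in for pre-extracted features.
model = DecoupledAligner()
eeg = torch.randn(32, 1700)          # EEG trial features
img = torch.randn(32, 512)           # frozen CLIP image embeddings
labels = torch.randint(0, 8, (32,))  # object-category labels
z_eeg, z_img = model(eeg, img)
loss = info_nce(z_eeg, z_img, model.logit_scale) + 0.1 * intra_class_consistency(z_img, labels)
loss.backward()
```

In this sketch, zero-shot decoding would then amount to embedding an EEG trial and retrieving the nearest CLIP image (or class prototype) embedding in the joint space; the decoupling module, mutual-information estimator, and loss weighting in the actual paper may differ.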
Related papers
- BrainDecoder: Style-Based Visual Decoding of EEG Signals [2.1611379205310506]
Decoding neural representations of visual stimuli from electroencephalography (EEG) offers valuable insights into brain activity and cognition.
Recent advancements in deep learning have significantly enhanced the field of visual decoding of EEG.
We present a novel visual decoding pipeline that emphasizes the reconstruction of the style, such as color and texture, of images viewed by the subject.
arXiv Detail & Related papers (2024-09-09T02:14:23Z) - GEM: Context-Aware Gaze EstiMation with Visual Search Behavior Matching for Chest Radiograph [32.1234295417225]
We propose a context-aware Gaze EstiMation (GEM) network that utilizes eye gaze data collected from radiologists to simulate their visual search behavior patterns.
It consists of a context-awareness module, visual behavior graph construction, and visual behavior matching.
Experiments on four publicly available datasets demonstrate the superiority of GEM over existing methods.
arXiv Detail & Related papers (2024-08-10T09:46:25Z) - MLIP: Enhancing Medical Visual Representation with Divergence Encoder
and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z) - Learning Robust Deep Visual Representations from EEG Brain Recordings [13.768240137063428]
This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep representations.
We demonstrate the generalizability of our feature extraction pipeline across three different datasets using deep-learning architectures.
We propose a novel framework that transforms unseen images into the EEG space and reconstructs approximate versions of them.
arXiv Detail & Related papers (2023-10-25T10:26:07Z) - A Knowledge-Driven Cross-view Contrastive Learning for EEG
Representation [48.85731427874065]
This paper proposes a knowledge-driven cross-view contrastive learning framework (KDC2) to extract effective representations from EEG with limited labels.
The KDC2 method creates scalp and neural views of EEG signals, simulating the internal and external representation of brain activity.
By modeling prior neural knowledge based on neural information consistency theory, the proposed method extracts invariant and complementary neural knowledge to generate combined representations.
arXiv Detail & Related papers (2023-09-21T08:53:51Z) - Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z) - See What You See: Self-supervised Cross-modal Retrieval of Visual
Stimuli from Brain Activity [37.837710340954374]
We present a single-stage EEG-visual retrieval paradigm in which data from the two modalities are correlated directly, rather than through their annotations.
We demonstrate that the proposed approach completes an instance-level EEG-visual retrieval task that existing methods cannot.
arXiv Detail & Related papers (2022-08-07T08:11:15Z) - Cross-modal Representation Learning for Zero-shot Action Recognition [67.57406812235767]
We present a cross-modal Transformer-based framework, which jointly encodes video data and text labels for zero-shot action recognition (ZSAR).
Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner.
Experimental results show our model considerably improves upon the state of the art in ZSAR, reaching encouraging top-1 accuracy on the UCF101, HMDB51, and ActivityNet benchmark datasets.
arXiv Detail & Related papers (2022-05-03T17:39:27Z) - CogAlign: Learning to Align Textual Neural Representations to Cognitive
Language Processing Signals [60.921888445317705]
We propose CogAlign, an approach that integrates cognitive language processing signals into natural language processing models.
We show that CogAlign achieves significant improvements with multiple cognitive features over state-of-the-art models on public datasets.
arXiv Detail & Related papers (2021-06-10T07:10:25Z) - Pathological Retinal Region Segmentation From OCT Images Using Geometric
Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset, which contains images captured with different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)