Related papers: ViEEG: Hierarchical Neural Coding with Cross-Modal Progressive Enhancement for EEG-Based Visual Decoding

ViEEG: Hierarchical Neural Coding with Cross-Modal Progressive Enhancement for EEG-Based Visual Decoding

URL: http://arxiv.org/abs/2505.12408v2
Date: Sun, 25 May 2025 14:25:21 GMT
Title: ViEEG: Hierarchical Neural Coding with Cross-Modal Progressive Enhancement for EEG-Based Visual Decoding
Authors: Minxu Liu, Donghai Guan, Chuhang Zheng, Chunwei Tian, Jie Wen, Qi Zhu,
Abstract summary: ViEEG is a biologically inspired hierarchical EEG decoding framework that aligns with the Hubel-Wiesel theory of visual processing.<n>Our framework achieves state-of-the-art performance, with 40.9% Top-1 accuracy in subject-dependent and 22.9% Top-1 accuracy in cross-subject settings, surpassing existing methods by over 45%.
Score: 14.18190036916225
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Understanding and decoding brain activity into visual representations is a fundamental challenge at the intersection of neuroscience and artificial intelligence. While EEG-based visual decoding has shown promise due to its non-invasive, low-cost nature and millisecond-level temporal resolution, existing methods are limited by their reliance on flat neural representations that overlook the brain's inherent visual hierarchy. In this paper, we introduce ViEEG, a biologically inspired hierarchical EEG decoding framework that aligns with the Hubel-Wiesel theory of visual processing. ViEEG decomposes each visual stimulus into three biologically aligned components-contour, foreground object, and contextual scene-serving as anchors for a three-stream EEG encoder. These EEG features are progressively integrated via cross-attention routing, simulating cortical information flow from V1 to IT to the association cortex. We further adopt hierarchical contrastive learning to align EEG representations with CLIP embeddings, enabling zero-shot object recognition. Extensive experiments on the THINGS-EEG dataset demonstrate that ViEEG achieves state-of-the-art performance, with 40.9% Top-1 accuracy in subject-dependent and 22.9% Top-1 accuracy in cross-subject settings, surpassing existing methods by over 45%. Our framework not only advances the performance frontier but also sets a new paradigm for biologically grounded brain decoding in AI.

Related papers

NeuroCLIP: Brain-Inspired Prompt Tuning for EEG-to-Image Multimodal Contrastive Learning [13.254096454986318]
We present NeuroCLIP, a prompt tuning framework tailored for EEG-to-image contrastive learning.<n>We are the first to introduce visual prompt tokens into EEG-image alignment, acting as global, modality-level prompts.<n>On the THINGS-EEG2 dataset, NeuroCLIP achieves a Top-1 accuracy of 63.2% in zero-shot image retrieval.
arXiv Detail & Related papers (2025-11-12T12:13:24Z)
Moving Beyond Diffusion: Hierarchy-to-Hierarchy Autoregression for fMRI-to-Image Reconstruction [65.67001243986981]
We propose MindHier, a coarse-to-fine fMRI-to-image reconstruction framework built on scale-wise autoregressive modeling.<n>MindHier achieves superior semantic fidelity, 4.67x faster inference, and more deterministic results than the diffusion-based baselines.
arXiv Detail & Related papers (2025-10-25T15:40:07Z)
Spatial-Functional awareness Transformer-based graph archetype contrastive learning for Decoding Visual Neural Representations from EEG [3.661246946935037]
We propose a Spatial-Functional Awareness Transformer-based Graph Archetype Contrastive Learning (SFTG) framework to enhance EEG-based visual decoding.<n>Specifically, we introduce the EEG Graph Transformer (EGT), a novel graph-based neural architecture that simultaneously encodes spatial brain connectivity and temporal neural dynamics.<n>To mitigate high intra-subject variability, we propose Graph Archetype Contrastive Learning (GAC), which learns subject-specific EEG graph archetypes to improve feature consistency and class separability.
arXiv Detail & Related papers (2025-09-29T13:27:55Z)
CATVis: Context-Aware Thought Visualization [2.8298952038412706]
We propose a novel 5-stage framework for decoding visual representations from EEG signals.<n>We enable context-aware EEG-to-image generation through cross-modal alignment and re-ranking.<n> Experimental results demonstrate that our method generates high-quality images aligned with visual stimuli.
arXiv Detail & Related papers (2025-07-15T17:47:01Z)
Interpretable EEG-to-Image Generation with Semantic Prompts [6.712646807032639]
Our model bypasses direct EEG-to-image generation by aligning EEG signals with semantic captions.<n>A transformer-based EEG encoder maps brain activity to these captions through contrastive learning.<n>This text-mediated framework yields state-of-the-art visual decoding on the EEGCVPR dataset.
arXiv Detail & Related papers (2025-07-09T17:18:06Z)
Learning Interpretable Representations Leads to Semantically Faithful EEG-to-Text Generation [52.51005875755718]
We focus on EEG-to-text decoding and address its hallucination issue through the lens of posterior collapse.<n>Acknowledging the underlying mismatch in information capacity between EEG and text, we reframe the decoding task as semantic summarization of core meanings.<n>Experiments on the public ZuCo dataset demonstrate that GLIM consistently generates fluent, EEG-grounded sentences.
arXiv Detail & Related papers (2025-05-21T05:29:55Z)
CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information [61.1904164368732]
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals.<n>Specifically, CognitionCapturer trains Modality Experts for each modality to extract cross-modal information from the EEG modality.<n>The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities.
arXiv Detail & Related papers (2024-12-13T16:27:54Z)
Neuro-3D: Towards 3D Visual Decoding from EEG Signals [49.502364730056044]
We introduce a new neuroscience task: decoding 3D visual perception from EEG signals. We first present EEG-3D, a dataset featuring multimodal analysis data and EEG recordings from 12 subjects viewing 72 categories of 3D objects rendered in both videos and images. We propose Neuro-3D, a 3D visual decoding framework based on EEG signals.
arXiv Detail & Related papers (2024-11-19T05:52:17Z)
Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder [69.7813498468116]
We propose Contrastive EEG-Text Masked Autoencoder (CET-MAE), a novel model that orchestrates compound self-supervised learning across and within EEG and text. We also develop a framework called E2T-PTR (EEG-to-Text decoding using Pretrained Transferable Representations) to decode text from EEG sequences.
arXiv Detail & Related papers (2024-02-27T11:45:21Z)
BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction [7.512223286737468]
Analyzing and reconstructing visual stimuli from brain signals effectively advances the understanding of human visual system. However, the EEG signals are complex and contain significant noise. This leads to substantial limitations in existing works of visual stimuli reconstruction from EEG. We propose a novel approach called BrainVis to address these challenges.
arXiv Detail & Related papers (2023-12-22T17:49:11Z)
Learning Robust Deep Visual Representations from EEG Brain Recordings [13.768240137063428]
This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep representations. We demonstrate the generalizability of our feature extraction pipeline across three different datasets using deep-learning architectures. We propose a novel framework to transform unseen images into the EEG space and reconstruct them with approximation.
arXiv Detail & Related papers (2023-10-25T10:26:07Z)
A Knowledge-Driven Cross-view Contrastive Learning for EEG Representation [48.85731427874065]
This paper proposes a knowledge-driven cross-view contrastive learning framework (KDC2) to extract effective representations from EEG with limited labels. The KDC2 method creates scalp and neural views of EEG signals, simulating the internal and external representation of brain activity. By modeling prior neural knowledge based on neural information consistency theory, the proposed method extracts invariant and complementary neural knowledge to generate combined representations.
arXiv Detail & Related papers (2023-09-21T08:53:51Z)
Decoding Natural Images from EEG for Object Recognition [8.411976038504589]
This paper presents a self-supervised framework to demonstrate the feasibility of learning image representations from EEG signals. We achieve a top-1 accuracy of 15.6% and a top-5 accuracy of 42.8% in challenging 200-way zero-shot tasks. These findings yield valuable insights for neural decoding and brain-computer interfaces in real-world scenarios.
arXiv Detail & Related papers (2023-08-25T08:05:37Z)
Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding. Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
arXiv Detail & Related papers (2023-03-26T14:14:58Z)
BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding [51.911473457195555]
BrainCLIP is a task-agnostic fMRI-based brain decoding model. It bridges the modality gap between brain activity, image, and text. BrainCLIP can reconstruct visual stimuli with high semantic fidelity.
arXiv Detail & Related papers (2023-02-25T03:28:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.