BrainVis: Exploring the Bridge between Brain and Visual Signals via
Image Reconstruction
- URL: http://arxiv.org/abs/2312.14871v1
- Date: Fri, 22 Dec 2023 17:49:11 GMT
- Title: BrainVis: Exploring the Bridge between Brain and Visual Signals via
Image Reconstruction
- Authors: Honghao Fu, Zhiqi Shen, Jing Jih Chin, Hao Wang
- Abstract summary: We propose a novel approach to reconstruct visual stimuli from EEG signals.
We apply a self-supervised approach on the EEG signals to obtain EEG time-domain features.
We also align EEG time-frequency embeddings with the coarse and fine-grained semantics in the CLIP space.
Our proposed BrainVis outperforms the state of the art in both semantic fidelity reconstruction and generation quality.
- Score: 8.206564266319388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing and reconstructing visual stimuli from brain signals effectively
advances understanding of the human visual system. However, EEG signals are
complex and contain a significant amount of noise. This leads to substantial
limitations in existing works on visual stimuli reconstruction from EEG, such
as difficulties in aligning EEG embeddings with fine-grained semantic
information and a heavy reliance on additional large self-collected datasets
for training. To
address these challenges, we propose a novel approach called BrainVis. Firstly,
we divide the EEG signals into various units and apply a self-supervised
approach on them to obtain EEG time-domain features, in an attempt to ease the
training difficulty. Additionally, we utilize frequency-domain features to
enhance the EEG representations. Then, we
simultaneously align EEG time-frequency embeddings with the interpolation of
the coarse and fine-grained semantics in the CLIP space, to highlight the
primary visual components and reduce the cross-modal alignment difficulty.
Finally, we adopt the cascaded diffusion models to reconstruct images. Our
proposed BrainVis outperforms the state of the art in both semantic fidelity
reconstruction and generation quality. Notably, we reduce the training data
scale to 10% of the previous work.
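The pipeline the abstract describes (splitting EEG into units, adding frequency-domain features, and interpolating coarse and fine-grained semantics in CLIP space) can be sketched as follows. This is a minimal illustration under assumed parameters, not the authors' implementation: the unit length, embedding dimension, and interpolation weight are hypothetical, and a plain FFT magnitude spectrum stands in for the paper's learned frequency-domain encoder.

```python
import numpy as np

def segment_eeg(eeg, unit_len):
    """Split a (channels, time) EEG recording into fixed-length time units."""
    n_units = eeg.shape[1] // unit_len
    return eeg[:, :n_units * unit_len].reshape(eeg.shape[0], n_units, unit_len)

def frequency_features(units):
    """Magnitude spectrum per unit via real FFT; a simple stand-in for a
    learned frequency-domain feature extractor."""
    return np.abs(np.fft.rfft(units, axis=-1))

def interpolate_clip(coarse, fine, alpha=0.5):
    """Linear interpolation between a coarse (e.g. class-level) and a
    fine-grained (e.g. caption-level) CLIP embedding; the EEG encoder's
    output would be aligned to this interpolated target."""
    return alpha * coarse + (1 - alpha) * fine

# Toy example with hypothetical sizes.
eeg = np.random.randn(128, 440)           # 128 channels, 440 time samples
units = segment_eeg(eeg, unit_len=55)     # -> (128, 8, 55)
freq = frequency_features(units)          # -> (128, 8, 28)
coarse = np.random.randn(512)             # coarse CLIP embedding
fine = np.random.randn(512)               # fine-grained CLIP embedding
target = interpolate_clip(coarse, fine, alpha=0.5)
```

In this sketch the interpolated CLIP vector plays the role of the cross-modal alignment target; the reconstruction stage (cascaded diffusion conditioned on the aligned EEG embedding) is omitted.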
Related papers
- MindFormer: A Transformer Architecture for Multi-Subject Brain Decoding via fMRI [50.55024115943266]
We introduce a new Transformer architecture called MindFormer to generate fMRI-conditioned feature vectors.
MindFormer incorporates two key innovations: 1) a novel training strategy based on the IP-Adapter to extract semantically meaningful features from fMRI signals, and 2) a subject-specific token and linear layer that effectively capture individual differences in fMRI signals.
arXiv Detail & Related papers (2024-05-28T00:36:25Z) - Reconstructing Visual Stimulus Images from EEG Signals Based on Deep
Visual Representation Model [5.483279087074447]
We propose a novel image reconstruction method based on EEG signals in this paper.
To ensure the high recognizability of visual stimulus images presented in a fast-switching manner, we build a visual stimuli image dataset.
A deep visual representation model (DVRM), consisting of a primary encoder and a subordinate decoder, is proposed to reconstruct visual stimuli.
arXiv Detail & Related papers (2024-03-11T09:19:09Z) - Learning Robust Deep Visual Representations from EEG Brain Recordings [13.768240137063428]
This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep representations.
We demonstrate the generalizability of our feature extraction pipeline across three different datasets using deep-learning architectures.
We propose a novel framework to transform unseen images into the EEG space and reconstruct them approximately.
arXiv Detail & Related papers (2023-10-25T10:26:07Z) - A Knowledge-Driven Cross-view Contrastive Learning for EEG
Representation [48.85731427874065]
This paper proposes a knowledge-driven cross-view contrastive learning framework (KDC2) to extract effective representations from EEG with limited labels.
The KDC2 method creates scalp and neural views of EEG signals, simulating the internal and external representation of brain activity.
By modeling prior neural knowledge based on neural information consistency theory, the proposed method extracts invariant and complementary neural knowledge to generate combined representations.
arXiv Detail & Related papers (2023-09-21T08:53:51Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - Seeing through the Brain: Image Reconstruction of Visual Perception from
Human Brain Signals [27.92796103924193]
We propose a comprehensive pipeline, named NeuroImagen, for reconstructing visual stimuli images from EEG signals.
We incorporate a novel multi-level perceptual information decoding to draw multi-grained outputs from the given EEG data.
arXiv Detail & Related papers (2023-07-27T12:54:16Z) - Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding.
Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
arXiv Detail & Related papers (2023-03-26T14:14:58Z) - BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP
for Generic Natural Visual Stimulus Decoding [51.911473457195555]
BrainCLIP is a task-agnostic fMRI-based brain decoding model.
It bridges the modality gap between brain activity, image, and text.
BrainCLIP can reconstruct visual stimuli with high semantic fidelity.
arXiv Detail & Related papers (2023-02-25T03:28:54Z) - fMRI from EEG is only Deep Learning away: the use of interpretable DL to
unravel EEG-fMRI relationships [68.8204255655161]
We present an interpretable domain grounded solution to recover the activity of several subcortical regions from multichannel EEG data.
We recover individual spatial and time-frequency patterns of scalp EEG predictive of the hemodynamic signal in the subcortical nuclei.
arXiv Detail & Related papers (2022-10-23T15:11:37Z) - Mind Reader: Reconstructing complex images from brain activities [16.78619734818198]
We focus on reconstructing the complex image stimuli from fMRI (functional magnetic resonance imaging) signals.
Unlike previous works that reconstruct images with single objects or simple shapes, our work aims to reconstruct image stimuli rich in semantics.
We find that incorporating an additional text modality is beneficial for the reconstruction problem compared to directly translating brain signals to images.
arXiv Detail & Related papers (2022-09-30T06:32:46Z) - See What You See: Self-supervised Cross-modal Retrieval of Visual
Stimuli from Brain Activity [37.837710340954374]
We present a single-stage EEG-visual retrieval paradigm where data of two modalities are correlated, as opposed to their annotations.
We demonstrate the proposed approach completes an instance-level EEG-visual retrieval task which existing methods cannot.
arXiv Detail & Related papers (2022-08-07T08:11:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.