Semantic Brain Decoding: from fMRI to conceptually similar image reconstruction of visual stimuli
- URL: http://arxiv.org/abs/2212.06726v2
- Date: Wed, 22 Mar 2023 11:17:29 GMT
- Title: Semantic Brain Decoding: from fMRI to conceptually similar image reconstruction of visual stimuli
- Authors: Matteo Ferrante, Tommaso Boccato, Nicola Toschi
- Abstract summary: We propose a novel approach to brain decoding that also relies on semantic and contextual similarity.
We employ an fMRI dataset of natural image vision and create a deep learning decoding pipeline inspired by the existence of both bottom-up and top-down processes in human vision.
We produce reconstructions of visual stimuli that match the original content very well on a semantic level, surpassing the state of the art in previous literature.
- Score: 0.29005223064604074
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Brain decoding is a field of computational neuroscience that uses measurable
brain activity to infer mental states or internal representations of perceptual
inputs. In this work, we propose a novel approach to brain decoding that also
relies on semantic and contextual similarity. We employ an fMRI dataset of
natural image vision and create a deep learning decoding pipeline inspired by
the existence of both bottom-up and top-down processes in human vision. We
train a linear brain-to-feature model to map fMRI activity features to visual
stimuli features, assuming that the brain projects visual information onto a
space that is homeomorphic to the latent space represented by the last
convolutional layer of a pretrained convolutional neural network, which
typically collects a variety of semantic features that summarize and highlight
similarities and differences between concepts. These features are then
categorized in the latent space using a nearest-neighbor strategy, and the
results are used to condition a generative latent diffusion model to create
novel images. From fMRI data only, we produce reconstructions of visual stimuli
that match the original content very well on a semantic level, surpassing the
state of the art in previous literature. We evaluate our work quantitatively with a semantic metric (the Wu-Palmer similarity over the WordNet lexicon, which averaged 0.57 on the reconstructions) and with a human evaluation experiment in which, according to the multiple criteria humans use to judge image similarity, reconstructions were rated correct for over 80% of the test set.
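For concreteness, the kind of semantic features the abstract refers to, activations from the last convolutional stage of a pretrained CNN, can be extracted as in the minimal sketch below. The choice of ResNet-50 via torchvision and the exact layer cut are assumptions for illustration; the paper's backbone is not specified in this summary.

```python
# Hedged sketch: extracting last-convolutional-stage features from a pretrained CNN.
# The backbone (ResNet-50) and preprocessing are assumptions, not the paper's exact setup.
import torch
from torchvision import models, transforms

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
# Drop the final classifier so the output is the pooled convolutional feature vector.
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])
feature_extractor.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def image_features(pil_image):
    """Return a (1, 2048) semantic feature vector for one PIL image."""
    x = preprocess(pil_image).unsqueeze(0)          # (1, 3, 224, 224)
    return feature_extractor(x).flatten(1).numpy()  # pooled conv features
```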
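The decoding steps themselves (linear brain-to-feature mapping, nearest-neighbor categorization in the feature space, and conditioning of a generative model) can be sketched as follows. Ridge regression, class-prototype features, and the use of the decoded label as a text prompt for a latent diffusion model are illustrative assumptions, not necessarily the authors' exact choices.

```python
# Minimal sketch of the decoding pipeline described in the abstract.
# Assumptions (not from the paper): ridge regression as the linear map, and the
# predicted class label used as a text prompt for a latent diffusion model.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neighbors import NearestNeighbors

def fit_brain_to_feature(X_train, Y_train, alpha=1.0):
    """Linear brain-to-feature model: fMRI voxel patterns -> CNN feature space.

    X_train: (n_trials, n_voxels) fMRI activity
    Y_train: (n_trials, n_features) CNN features of the images shown on those trials
    """
    model = Ridge(alpha=alpha)
    model.fit(X_train, Y_train)
    return model

def decode_labels(model, x_test, prototype_feats, prototype_labels, k=1):
    """Project a test fMRI pattern into feature space, then categorize it with a
    nearest-neighbor search over class prototype features."""
    y_pred = model.predict(np.atleast_2d(x_test))
    nn = NearestNeighbors(n_neighbors=k).fit(prototype_feats)
    _, idx = nn.kneighbors(y_pred)
    return [prototype_labels[i] for i in idx[0]]

# The decoded label could then condition a generative latent diffusion model,
# e.g. (hypothetically) as a text prompt:
#   image = diffusion_pipeline(prompt=f"a photo of a {label}").images[0]
```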
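Finally, the Wu-Palmer similarity used for the quantitative evaluation can be computed with NLTK's WordNet interface; the synsets below are examples only, since the abstract does not specify how decoded labels are mapped to WordNet.

```python
# Illustrative Wu-Palmer similarity over WordNet (requires: nltk.download("wordnet")).
from nltk.corpus import wordnet as wn

predicted = wn.synset("dog.n.01")       # example decoded concept (assumed)
ground_truth = wn.synset("wolf.n.01")   # example ground-truth concept (assumed)

# wup_similarity returns a value in (0, 1]; higher means more semantically similar.
# The paper reports an average of 0.57 over its test set with this metric.
print(ground_truth.wup_similarity(predicted))
```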
Related papers
- Teaching CORnet Human fMRI Representations for Enhanced Model-Brain Alignment [2.035627332992055]
Functional magnetic resonance imaging (fMRI), a widely used technique in cognitive neuroscience, can record neural activation in the human visual cortex during visual perception.
This study proposed ReAlnet-fMRI, a model based on the SOTA vision model CORnet but optimized using human fMRI data through a multi-layer encoding-based alignment framework.
The fMRI-optimized ReAlnet-fMRI exhibited higher similarity to the human brain than both CORnet and the control model in within- and across-subject as well as within- and across-modality (fMRI and EEG) model-brain comparisons.
arXiv Detail & Related papers (2024-07-15T03:31:42Z)
- MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce MindFormer, a novel semantic alignment method for multi-subject fMRI signals.
This model is specifically designed to generate fMRI-conditioned feature vectors that can be used to condition a Stable Diffusion model for fMRI-to-image generation or a large language model (LLM) for fMRI-to-text generation.
Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z)
- Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity [60.983327742457995]
Reconstructing the viewed images from human brain activity bridges human and computer vision through the Brain-Computer Interface.
We devise Psychometry, an omnifit model for reconstructing images from functional Magnetic Resonance Imaging (fMRI) obtained from different subjects.
arXiv Detail & Related papers (2024-03-29T07:16:34Z)
- Decoding Realistic Images from Brain Activity with Contrastive Self-supervision and Latent Diffusion [29.335943994256052]
Reconstructing visual stimuli from human brain activities provides a promising opportunity to advance our understanding of the brain's visual system.
We propose a two-phase framework named Contrast and Diffuse (CnD) to decode realistic images from functional magnetic resonance imaging (fMRI) recordings.
arXiv Detail & Related papers (2023-09-30T09:15:22Z)
- Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex [12.1427193917406]
We propose an artificial neural network dubbed VISION to mimic the human brain and show how it can foster neuroscientific inquiries.
VISION successfully predicts human hemodynamic responses (fMRI voxel values) to visual inputs, with an accuracy exceeding state-of-the-art performance by 45%.
arXiv Detail & Related papers (2023-09-26T15:38:26Z)
- UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity [2.666777614876322]
We propose UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity.
We transform fMRI voxels into text and image latents for low-level information, generating realistic captions and images.
UniBrain outperforms current methods both qualitatively and quantitatively in terms of image reconstruction and reports image captioning results for the first time on the Natural Scenes dataset.
arXiv Detail & Related papers (2023-08-14T19:49:29Z)
- Brain Captioning: Decoding human brain activity into images and text [1.5486926490986461]
We present an innovative method for decoding brain activity into meaningful images and captions.
Our approach takes advantage of cutting-edge image captioning models and incorporates a unique image reconstruction pipeline.
We evaluate our methods using quantitative metrics for both generated captions and images.
arXiv Detail & Related papers (2023-05-19T09:57:19Z)
- Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z)
- Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding.
Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
arXiv Detail & Related papers (2023-03-26T14:14:58Z)
- BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding [51.911473457195555]
BrainCLIP is a task-agnostic fMRI-based brain decoding model.
It bridges the modality gap between brain activity, image, and text.
BrainCLIP can reconstruct visual stimuli with high semantic fidelity.
arXiv Detail & Related papers (2023-02-25T03:28:54Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)