Reconstructing seen images from human brain activity via guided stochastic search
- URL: http://arxiv.org/abs/2305.00556v2
- Date: Tue, 2 May 2023 00:54:12 GMT
- Title: Reconstructing seen images from human brain activity via guided stochastic search
- Authors: Reese Kneeland, Jordyn Ojeda, Ghislain St-Yves, Thomas Naselaris
- Abstract summary: We use conditional generative diffusion models to extend and improve visual reconstruction algorithms.
We decode a semantic descriptor from human brain activity (7T fMRI) in voxels across most of visual cortex.
We show that this process converges on high-quality reconstructions by refining low-level image details while preserving semantic content.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Visual reconstruction algorithms are an interpretive tool that map brain
activity to pixels. Past reconstruction algorithms employed brute-force search
through a massive library to select candidate images that, when passed through
an encoding model, accurately predict brain activity. Here, we use conditional
generative diffusion models to extend and improve this search-based strategy.
We decode a semantic descriptor from human brain activity (7T fMRI) in voxels
across most of visual cortex, then use a diffusion model to sample a small
library of images conditioned on this descriptor. We pass each sample through
an encoding model, select the images that best predict brain activity, and then
use these images to seed another library. We show that this process converges
on high-quality reconstructions by refining low-level image details while
preserving semantic content across iterations. Interestingly, the
time-to-convergence differs systematically across visual cortex, suggesting a
succinct new way to measure the diversity of representations across visual
brain areas.
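The loop the abstract describes — decode a descriptor, sample a conditioned library, score it with an encoding model, reseed from the best samples — is easy to state in code. Below is a minimal, runnable sketch of that iterate-sample-select structure using toy stand-ins (a random linear encoding model and Gaussian perturbations in place of the semantic decoder and conditional diffusion model); it illustrates the search strategy, not the authors' implementation.

```python
import numpy as np

# Toy stand-ins (not the paper's models): a linear "encoding model" from a
# flat image representation to voxel activity, and noise perturbations of
# seed images in place of a conditional diffusion sampler.
rng = np.random.default_rng(0)
n_voxels, img_dim = 500, 256
W = rng.standard_normal((n_voxels, img_dim)) / np.sqrt(img_dim)

def encode(img):
    """Predict voxel activity from an image representation."""
    return W @ img

target = rng.standard_normal(img_dim)   # the "seen" image
activity = encode(target)               # measured brain activity (noiseless toy)

seeds = [rng.standard_normal(img_dim) for _ in range(8)]
for it in range(20):
    sigma = 0.8 ** it                   # narrow the search over iterations
    # Sample a small library around the current seeds, score each candidate
    # by how well its predicted activity correlates with the measurement,
    # and keep the best candidates to seed the next library.
    library = [s + sigma * rng.standard_normal(img_dim)
               for s in seeds for _ in range(16)]
    scores = [np.corrcoef(encode(x), activity)[0, 1] for x in library]
    seeds = [library[i] for i in np.argsort(scores)[-8:]]
print(f"final best correlation: {max(scores):.3f}")
```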
Related papers
- Towards Retrieval-Augmented Architectures for Image Captioning [81.11529834508424]
This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process.
Specifically, we propose two model variants that incorporate a knowledge retriever component that is based on visual similarities.
We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions.
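As a rough illustration of the retrieval component, the sketch below implements a kNN lookup over image embeddings: given a query image's embedding, it returns the captions of the k most visually similar images in an external memory. All names and the random embeddings are hypothetical stand-ins for a real visual encoder and datastore, not the paper's system.

```python
import numpy as np

def knn_retrieve(query_emb, memory_embs, memory_captions, k=3):
    """Return captions of the k memory images whose embeddings are
    most similar to the query embedding (cosine similarity)."""
    q = query_emb / np.linalg.norm(query_emb)
    m = memory_embs / np.linalg.norm(memory_embs, axis=1, keepdims=True)
    sims = m @ q                           # cosine similarity to each entry
    top = np.argsort(sims)[-k:][::-1]      # best matches first
    return [memory_captions[i] for i in top]

# Toy usage with random embeddings standing in for a real visual encoder.
rng = np.random.default_rng(0)
memory = rng.standard_normal((100, 512))
captions = [f"caption {i}" for i in range(100)]
print(knn_retrieve(rng.standard_normal(512), memory, captions))
```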
arXiv Detail & Related papers (2024-05-21T18:02:07Z) - Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing [72.45257414889478]
We aim to reduce human workload by predicting connectivity between over-segmented neuron pieces.
We first construct a dataset, named FlyTracing, that contains millions of pairwise connections between segments spanning the whole fly brain.
We propose a novel connectivity-aware contrastive learning method to generate dense volumetric EM image embedding.
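The summary does not spell out the loss, but a connectivity-aware contrastive objective can be sketched as a generic InfoNCE-style pair loss: embeddings of segment pairs known to connect are pulled together, while all other pairs in the batch are pushed apart. This is a simplification for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_pair_loss(emb_a, emb_b, temperature=0.1):
    """InfoNCE-style loss: row i of emb_a should match row i of emb_b
    (a connected segment pair) and mismatch all other rows."""
    a = F.normalize(emb_a, dim=1)
    b = F.normalize(emb_b, dim=1)
    logits = a @ b.t() / temperature     # pairwise similarity matrix
    targets = torch.arange(a.size(0))    # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy batch: 16 segment pairs with 128-d embeddings from some volumetric encoder.
loss = contrastive_pair_loss(torch.randn(16, 128), torch.randn(16, 128))
print(loss.item())
```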
arXiv Detail & Related papers (2024-01-05T19:45:12Z) - Brain-optimized inference improves reconstructions of fMRI brain
activity [0.0]
We evaluate the prospect of further improving recent decoding methods by optimizing for consistency between reconstructions and brain activity during inference.
We sample seed reconstructions from a base decoding method, then iteratively refine these reconstructions using a brain-optimized encoding model.
We show that reconstruction quality can be significantly improved by explicitly aligning decoding to brain activity distributions.
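The paper refines reconstructions by re-sampling; a gradient-based variant of the same idea — adjust a latent until a brain-optimized encoding model's prediction matches the measured activity — is sketched below with a toy linear encoder. It illustrates "optimizing for consistency with brain activity", not the authors' sampler.

```python
import torch

# Toy stand-in for a brain-optimized encoding model: a fixed linear map
# from a 64-d image latent to 300 "voxels" (not the paper's model).
torch.manual_seed(0)
W = torch.randn(300, 64) / 8
activity = W @ torch.randn(64)              # target brain activity

# Seed reconstruction from a base decoder (here: random), then refine the
# latent so the encoding model's prediction matches the measured activity.
z = torch.randn(64, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = torch.mean((W @ z - activity) ** 2)  # prediction-activity mismatch
    loss.backward()
    opt.step()
print(f"final mismatch: {loss.item():.4f}")
```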
arXiv Detail & Related papers (2023-12-12T20:08:59Z) - UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion
Model from Human Brain Activity [2.666777614876322]
We propose UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity.
We transform fMRI voxels into text and image latents, which carry the low-level information needed to generate realistic captions and images.
UniBrain outperforms current methods both qualitatively and quantitatively in terms of image reconstruction and reports image captioning results for the first time on the Natural Scenes dataset.
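One common way to obtain such latents is voxel-wise linear regression; the sketch below fits two ridge-regression heads from voxels to a text latent and an image latent (toy shapes loosely echoing text-encoder and VAE latents), which would then condition a diffusion model. This is an assumed, generic decoder, not UniBrain's published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 1000                # toy sizes
X = rng.standard_normal((n_trials, n_voxels))         # fMRI voxel responses
Y_text = rng.standard_normal((n_trials, 77 * 8))      # toy text-latent targets
Y_img = rng.standard_normal((n_trials, 4 * 16 * 16))  # toy image-latent targets

def ridge_fit(X, Y, lam=10.0):
    """Closed-form ridge regression from voxels to a latent space."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ Y)

W_text, W_img = ridge_fit(X, Y_text), ridge_fit(X, Y_img)
# A new scan would then condition a diffusion model on both decoded latents:
text_latent, image_latent = X[:1] @ W_text, X[:1] @ W_img
print(text_latent.shape, image_latent.shape)
```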
arXiv Detail & Related papers (2023-08-14T19:49:29Z) - Second Sight: Using brain-optimized encoding models to align image
distributions with human brain activity [0.0]
We introduce a novel reconstruction procedure (Second Sight) that iteratively refines an image distribution to maximize the alignment between the predictions of a voxel-wise encoding model and the brain activity patterns evoked by any target image.
We show that our process converges on a distribution of high-quality reconstructions by refining both semantic content and low-level image details across iterations.
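"Iteratively refining an image distribution" can be made concrete with a cross-entropy-method-style loop: maintain a Gaussian over image latents, sample from it, score each sample's encoding-model alignment with the target activity, and refit the Gaussian to the elite samples. The toy linear encoder below is an assumption; Second Sight's actual procedure differs in its details.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((300, 64)) / 8        # toy voxel-wise encoding model
activity = W @ rng.standard_normal(64)        # target activity pattern

# Refine a Gaussian over image latents toward high brain-activity alignment.
mu, sigma = np.zeros(64), np.ones(64)
for it in range(30):
    z = mu + sigma * rng.standard_normal((128, 64))      # sample the distribution
    scores = [np.corrcoef(W @ zi, activity)[0, 1] for zi in z]
    elite = z[np.argsort(scores)[-16:]]                  # best-aligned samples
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-3
print(f"best alignment: {max(scores):.3f}")
```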
arXiv Detail & Related papers (2023-06-01T17:31:07Z) - Brain Captioning: Decoding human brain activity into images and text [1.5486926490986461]
We present an innovative method for decoding brain activity into meaningful images and captions.
Our approach takes advantage of cutting-edge image captioning models and incorporates a unique image reconstruction pipeline.
We evaluate our methods using quantitative metrics for both generated captions and images.
arXiv Detail & Related papers (2023-05-19T09:57:19Z) - Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
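To make "a control model that exploits semantic and silhouette information" concrete, the sketch below fuses a semantic vector and a silhouette map into a single spatial conditioning tensor, in the spirit of a control network; the module, names, and shapes are invented for illustration and are not CMVDM's architecture.

```python
import torch
import torch.nn as nn

class ToyControl(nn.Module):
    """Fuses a semantic vector and a silhouette map into one conditioning
    signal for a generator, in the spirit of a control network."""
    def __init__(self, sem_dim=512, ch=64):
        super().__init__()
        self.sem_proj = nn.Linear(sem_dim, ch)
        self.sil_conv = nn.Conv2d(1, ch, kernel_size=3, padding=1)

    def forward(self, semantic, silhouette):
        s = self.sem_proj(semantic)[:, :, None, None]  # broadcast over space
        m = self.sil_conv(silhouette)                  # spatial layout cue
        return s + m                                   # combined condition

cond = ToyControl()(torch.randn(2, 512), torch.randn(2, 1, 32, 32))
print(cond.shape)  # torch.Size([2, 64, 32, 32])
```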
arXiv Detail & Related papers (2023-05-17T11:36:40Z) - Semantic Brain Decoding: from fMRI to conceptually similar image
reconstruction of visual stimuli [0.29005223064604074]
We propose a novel approach to brain decoding that also relies on semantic and contextual similarity.
We employ an fMRI dataset of natural image vision and create a deep learning decoding pipeline inspired by the existence of both bottom-up and top-down processes in human vision.
We produce reconstructions of visual stimuli that match the original content well at the semantic level, surpassing the previous state of the art.
arXiv Detail & Related papers (2022-12-13T16:54:08Z) - NAS-DIP: Learning Deep Image Prior with Neural Architecture Search [65.79109790446257]
Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior.
We propose to search for neural architectures that capture stronger image priors.
We search for an improved network by leveraging an existing neural architecture search algorithm.
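The core deep-image-prior idea — the network architecture itself regularizes the output — fits in a few lines: fit a small conv net, fed a fixed random input, to a single degraded image, relying on early iterations capturing structure before noise. The NAS component (searching for a better prior architecture) is omitted here; this is only the baseline DIP mechanism under toy settings.

```python
import torch
import torch.nn as nn

# Deep image prior in miniature: the untrained conv net is the prior.
torch.manual_seed(0)
clean = torch.rand(1, 1, 32, 32)
noisy = clean + 0.1 * torch.randn_like(clean)   # single degraded image
net = nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 1, 3, padding=1))
z = torch.randn(1, 8, 32, 32)                   # fixed random input
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):                         # early stopping acts as regularizer
    opt.zero_grad()
    loss = ((net(z) - noisy) ** 2).mean()
    loss.backward()
    opt.step()
print(f"final fit loss: {loss.item():.4f}")
```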
arXiv Detail & Related papers (2020-08-26T17:59:36Z) - Improved Slice-wise Tumour Detection in Brain MRIs by Computing
Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We propose a slice-wise, semi-supervised method for tumour detection based on computing a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
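A minimal version of the dissimilarity idea: encode a slice and its reconstruction, score the distance between the two latent codes, and flag slices whose score exceeds a threshold calibrated on healthy data. The Euclidean distance, the random latents, and the threshold below are placeholder assumptions, not the paper's learned dissimilarity function.

```python
import numpy as np

def anomaly_score(z_slice, z_recon):
    """Dissimilarity between a slice's latent code and the latent code of
    its reconstruction; healthy slices should reconstruct closely."""
    return np.linalg.norm(z_slice - z_recon)

def flag_tumour_slices(scores, threshold):
    """Slice-wise detection: flag slices whose score exceeds a threshold
    calibrated on healthy validation data."""
    return [i for i, s in enumerate(scores) if s > threshold]

# Toy usage: random 128-d latents stand in for VAE encodings of MRI slices.
rng = np.random.default_rng(0)
scores = [anomaly_score(rng.standard_normal(128), rng.standard_normal(128))
          for _ in range(10)]
print(flag_tumour_slices(scores, threshold=16.0))
```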
arXiv Detail & Related papers (2020-07-24T14:02:09Z) - Neural Sparse Representation for Image Restoration [116.72107034624344]
Inspired by the robustness and efficiency of sparse coding based image restoration models, we investigate the sparsity of neurons in deep networks.
Our method structurally enforces sparsity constraints upon hidden neurons.
Experiments show that sparse representation is crucial in deep neural networks for multiple image restoration tasks.
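The sparsity constraint can be illustrated in its simplest possible form: add an L1 penalty on hidden activations to the task loss so only a few neurons stay active per input. The paper enforces sparsity structurally rather than with a plain L1 term, so treat this as a conceptual simplification.

```python
import torch
import torch.nn as nn

# Minimal sketch: denoising MSE plus an L1 penalty on hidden activations.
torch.manual_seed(0)
net = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
x = torch.randn(32, 64)
noisy = x + 0.1 * torch.randn_like(x)

hidden = net[1](net[0](noisy))     # post-ReLU hidden activations
out = net[2](hidden)
loss = ((out - x) ** 2).mean() + 1e-3 * hidden.abs().mean()  # task + sparsity
loss.backward()
print(f"loss: {loss.item():.4f}")
```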
arXiv Detail & Related papers (2020-06-08T05:15:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.