Second Sight: Using brain-optimized encoding models to align image
distributions with human brain activity
- URL: http://arxiv.org/abs/2306.00927v1
- Date: Thu, 1 Jun 2023 17:31:07 GMT
- Authors: Reese Kneeland, Jordyn Ojeda, Ghislain St-Yves, Thomas Naselaris
- Abstract summary: We introduce a novel reconstruction procedure (Second Sight) that iteratively refines an image distribution to maximize the alignment between the predictions of a voxel-wise encoding model and the brain activity patterns evoked by any target image.
We show that our process converges on a distribution of high-quality reconstructions by refining both semantic content and low-level image details across iterations.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Two recent developments have accelerated progress in image reconstruction
from human brain activity: large datasets that offer samples of brain activity
in response to many thousands of natural scenes, and the open-sourcing of
powerful stochastic image-generators that accept both low- and high-level
guidance. Most work in this space has focused on obtaining point estimates of
the target image, with the ultimate goal of approximating literal pixel-wise
reconstructions of target images from the brain activity patterns they evoke.
This emphasis belies the fact that there is always a family of images that are
equally compatible with any evoked brain activity pattern, and the fact that
many image-generators are inherently stochastic and do not by themselves offer
a method for selecting the single best reconstruction from among the samples
they generate. We introduce a novel reconstruction procedure (Second Sight)
that iteratively refines an image distribution to explicitly maximize the
alignment between the predictions of a voxel-wise encoding model and the brain
activity patterns evoked by any target image. We show that our process
converges on a distribution of high-quality reconstructions by refining both
semantic content and low-level image details across iterations. Images sampled
from these converged image distributions are competitive with state-of-the-art
reconstruction algorithms. Interestingly, the time-to-convergence varies
systematically across visual cortex, with earlier visual areas generally taking
longer and converging on narrower image distributions, relative to higher-level
brain areas. Second Sight thus offers a succinct and novel method for exploring
the diversity of representations across visual brain areas.
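To make the procedure concrete, below is a minimal Python sketch of an encoding-model-guided search loop of this kind. It illustrates the idea rather than the authors' implementation: `generator.sample` and `encoding_model.predict` are hypothetical stand-ins for a stochastic image generator and a voxel-wise encoding model, and the alignment score is assumed here to be a simple voxel-wise correlation.

```python
import numpy as np

def pearson(a, b):
    """Alignment score: correlation between predicted and measured voxel activity."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def refine_distribution(generator, encoding_model, target_activity,
                        seeds=None, n_candidates=64, n_keep=8, n_iters=10):
    """Iteratively narrow an image distribution toward the measured brain activity."""
    kept = seeds  # images that parameterize the current distribution
    for _ in range(n_iters):
        # 1. Sample candidate images from the current distribution.
        candidates = generator.sample(around=kept, n=n_candidates)
        # 2. Score each candidate by how well the encoding model's predicted
        #    activity matches the activity evoked by the target image.
        scores = [pearson(encoding_model.predict(img), target_activity)
                  for img in candidates]
        # 3. Keep the best-aligned candidates; they define the next,
        #    narrower distribution.
        best = np.argsort(scores)[::-1][:n_keep]
        kept = [candidates[i] for i in best]
    return kept  # a converged distribution of high-alignment reconstructions
```

Note that the loop returns a set of images rather than a single point estimate, matching the paper's emphasis on the family of images that are equally compatible with a given activity pattern.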
Related papers
- FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior [50.0535198082903]
We offer a novel approach to image composition, which integrates multiple input images into a single, coherent image.
We show that the powerful generative prior of large-scale pre-trained diffusion models can accomplish generic image composition.
arXiv Detail & Related papers (2024-07-06T03:35:43Z)
- Brain-optimized inference improves reconstructions of fMRI brain activity [0.0]
We evaluate the prospect of further improving recent decoding methods by optimizing for consistency between reconstructions and brain activity during inference.
We sample seed reconstructions from a base decoding method, then iteratively refine these reconstructions using a brain-optimized encoding model.
We show that reconstruction quality can be significantly improved by explicitly aligning decoding to brain activity distributions.
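A minimal sketch of this seeded variant, under the same assumptions and reusing the hypothetical `refine_distribution` loop above; `base_decoder.decode` stands in for any existing decoding method:

```python
def brain_optimized_inference(base_decoder, generator, encoding_model,
                              target_activity, n_seeds=8):
    # Seed the search with reconstructions from an existing decoding method,
    # then refine them against the measured brain activity as before.
    seeds = [base_decoder.decode(target_activity) for _ in range(n_seeds)]
    return refine_distribution(generator, encoding_model, target_activity,
                               seeds=seeds)
```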
arXiv Detail & Related papers (2023-12-12T20:08:59Z)
- Brain Captioning: Decoding human brain activity into images and text [1.5486926490986461]
We present an innovative method for decoding brain activity into meaningful images and captions.
Our approach takes advantage of cutting-edge image captioning models and incorporates a unique image reconstruction pipeline.
We evaluate our methods using quantitative metrics for both generated captions and images.
arXiv Detail & Related papers (2023-05-19T09:57:19Z)
- Reconstructing seen images from human brain activity via guided stochastic search [0.0]
We use conditional generative diffusion models to extend and improve visual reconstruction algorithms.
We decode a semantic descriptor from human brain activity (7T fMRI) in voxels across most of visual cortex.
We show that this process converges on high-quality reconstructions by refining low-level image details while preserving semantic content.
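One plausible sketch of such a guided search, reusing the assumed `pearson` alignment score from the first sketch; `diffusion.sample` is a hypothetical image-to-image diffusion step conditioned on the decoded semantic descriptor, and the decaying `strength` schedule is just one way to realize the convergence described above:

```python
def guided_stochastic_search(diffusion, encoding_model, descriptor,
                             target_activity, init_image,
                             n_iters=8, n_branch=16, strength=0.9, decay=0.8):
    best = init_image
    for _ in range(n_iters):
        # Resample low-level details around the current best image while the
        # decoded semantic descriptor holds the high-level content fixed.
        variants = [diffusion.sample(image=best, condition=descriptor,
                                     strength=strength)
                    for _ in range(n_branch)]
        # Keep the variant whose predicted activity best matches the data.
        best = max(variants, key=lambda im: pearson(
            encoding_model.predict(im), target_activity))
        strength *= decay  # shrink the perturbations as reconstructions converge
    return best
```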
arXiv Detail & Related papers (2023-04-30T19:40:01Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce an effective framework, Generalizable Model-based Neural Radiance Fields (GM-NeRF), to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- Multiscale Voxel Based Decoding For Enhanced Natural Image Reconstruction From Brain Activity [0.22940141855172028]
We present a novel approach for enhanced image reconstruction, in which existing methods for object decoding and image reconstruction are merged.
This is achieved by conditioning the reconstructed image on its decoded image category using a class-conditional generative adversarial network and neural style transfer.
The results indicate that our approach improves the semantic similarity of the reconstructed images and can be used as a general framework for enhanced image reconstruction.
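A very loose sketch of such a merged pipeline; every interface here (`category_decoder`, `cgan`, `pixel_reconstructor`, `style_transfer`) is an assumed stand-in, and the ordering is illustrative rather than the paper's actual pipeline:

```python
def class_conditioned_reconstruction(voxels, category_decoder, cgan,
                                     pixel_reconstructor, style_transfer):
    label = category_decoder.predict(voxels)      # decode the image category
    prior = cgan.generate(class_label=label)      # class-conditional sample
    layout = pixel_reconstructor.decode(voxels)   # low-level reconstruction
    # Neural style transfer blends the decoded low-level layout with the
    # class-conditional prior to improve semantic similarity.
    return style_transfer(content=layout, style=prior)
```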
arXiv Detail & Related papers (2022-05-27T18:09:07Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents an architecture with the holistic goal of maintaining spatially precise, high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762]
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality.
In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network.
Our MANet follows a hybrid-domain learning framework, which allows it to simultaneously recover the frequency signal in the $k$-space domain and restore image content in the spatial domain.
arXiv Detail & Related papers (2021-10-15T13:16:59Z)
- Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
We propose a novel approach for few-shot talking-head synthesis.
We show that this disentangled representation leads to a significant improvement over previous methods.
arXiv Detail & Related papers (2021-04-29T17:59:42Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.