NeuralDiffuser: Neuroscience-inspired Diffusion Guidance for fMRI Visual Reconstruction
- URL: http://arxiv.org/abs/2402.13809v3
- Date: Wed, 08 Jan 2025 14:21:46 GMT
- Title: NeuralDiffuser: Neuroscience-inspired Diffusion Guidance for fMRI Visual Reconstruction
- Authors: Haoyu Li, Hao Wu, Badong Chen
- Abstract summary: We propose NeuralDiffuser, which incorporates primary visual feature guidance to provide detailed cues in the form of gradients.
This extension of the bottom-up process for diffusion models achieves both semantic coherence and detail fidelity when reconstructing visual stimuli.
- Score: 25.987801733791986
- License:
- Abstract: Reconstructing visual stimuli from functional Magnetic Resonance Imaging (fMRI) enables fine-grained retrieval of brain activity. However, accurately reconstructing diverse details, including structure, background, texture, and color, remains challenging. Stable diffusion models inevitably introduce variability into the reconstructed images, even under identical conditions. To address this challenge, we first examine diffusion methods from a neuroscientific perspective: they primarily perform top-down creation using pre-trained knowledge from extensive image datasets, but tend to lack detail-driven bottom-up perception, leading to a loss of faithful details. In this paper, we propose NeuralDiffuser, which incorporates primary visual feature guidance to provide detailed cues in the form of gradients. This extension of the bottom-up process for diffusion models achieves both semantic coherence and detail fidelity when reconstructing visual stimuli. Furthermore, we develop a novel guidance strategy for reconstruction tasks that ensures the consistency of repeated outputs with the original images rather than among the various outputs. Extensive experimental results on the Natural Scenes Dataset (NSD) qualitatively and quantitatively demonstrate the advances of NeuralDiffuser through horizontal comparisons against baseline and state-of-the-art methods, as well as longitudinal ablation studies.
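The gradient-based feature guidance the abstract describes can be illustrated with a minimal sketch: at each denoising step, the latent is nudged along the gradient of a feature-matching loss so that low-level features move toward those decoded from fMRI. The linear feature extractor, guidance scale, and noise level below are illustrative assumptions, not NeuralDiffuser's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.standard_normal((4, 8)) * 0.5   # toy linear feature extractor (assumed)
target = rng.standard_normal(4)         # features decoded from fMRI (stand-in)

def features(x):
    return W @ x

def guidance_grad(x):
    """Gradient of 0.5 * ||W x - target||^2 with respect to x."""
    return W.T @ (features(x) - target)

def guided_step(x, scale=0.05, noise=0.01):
    """One toy denoising step: bottom-up gradient cue plus diffusion noise."""
    x = x - scale * guidance_grad(x)
    return x + noise * rng.standard_normal(x.shape)

x0 = rng.standard_normal(8)             # initial latent
x = x0.copy()
for _ in range(400):
    x = guided_step(x)

err0 = np.linalg.norm(features(x0) - target)
err = np.linalg.norm(features(x) - target)
print(err < err0)   # guided sampling pulls the features toward the target
```

The key design point is that guidance enters only through the gradient term, so the pre-trained (here, trivial) denoising dynamics remain unchanged; in the paper's setting the same role is played by gradients of a loss on primary visual features.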
Related papers
- Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance [3.74142789780782]
We show how modern LDMs incorporate multi-modal guidance for structurally and semantically plausible image generations.
Brain-Streams maps fMRI signals from brain regions to appropriate embeddings.
We validate the reconstruction ability of Brain-Streams both quantitatively and qualitatively on a real fMRI dataset.
arXiv Detail & Related papers (2024-09-18T16:19:57Z) - One-step Generative Diffusion for Realistic Extreme Image Rescaling [47.89362819768323]
We propose a novel framework called One-Step Image Rescaling Diffusion (OSIRDiff) for extreme image rescaling.
OSIRDiff performs rescaling operations in the latent space of a pre-trained autoencoder.
It effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-08-17T09:51:42Z) - Diffusion Priors for Dynamic View Synthesis from Monocular Videos [59.42406064983643]
Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.
We first finetune a pretrained RGB-D diffusion model on the video frames using a customization technique.
We distill the knowledge from the finetuned model into a 4D representation encompassing both dynamic and static Neural Radiance Fields.
arXiv Detail & Related papers (2024-01-10T23:26:41Z) - Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion.
This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement.
We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z) - Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z) - UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity [2.666777614876322]
We propose UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity.
We transform fMRI voxels into text and image latents carrying low-level information to generate realistic captions and images.
UniBrain outperforms current methods both qualitatively and quantitatively in terms of image reconstruction and reports image captioning results for the first time on the Natural Scenes dataset.
arXiv Detail & Related papers (2023-08-14T19:49:29Z) - MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion [7.597218661195779]
We propose a two-stage image reconstruction model called MindDiffuser.
In Stage 1, the VQ-VAE latent representations and the CLIP text embeddings decoded from fMRI are put into Stable Diffusion.
In Stage 2, we utilize the CLIP visual features decoded from fMRI as supervisory information and continually adjust the two feature vectors decoded in Stage 1 through backpropagation to align the structural information.
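The Stage-2 alignment loop can be sketched as follows: the two Stage-1 vectors (VQ-VAE latent z, CLIP text embedding c) are treated as free parameters and adjusted by gradient descent so that a feature function of them matches the CLIP visual features decoded from fMRI. The linear maps A and B and the quadratic loss are toy stand-ins for the real Stable Diffusion + CLIP pipeline, which would be differentiated end to end.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 5)) * 0.3   # latent -> features map (assumed)
B = rng.standard_normal((6, 5)) * 0.3   # embedding -> features map (assumed)
s = rng.standard_normal(6)              # supervisory CLIP features from fMRI (stand-in)

z = rng.standard_normal(5)              # Stage-1 decoded VQ-VAE latent
c = rng.standard_normal(5)              # Stage-1 decoded CLIP text embedding

def loss_grads(z, c):
    """0.5 * ||A z + B c - s||^2 and its gradients w.r.t. z and c."""
    r = A @ z + B @ c - s
    return 0.5 * r @ r, A.T @ r, B.T @ r

loss0 = loss_grads(z, c)[0]
lr = 0.2                                # assumed step size
for _ in range(500):
    _, gz, gc = loss_grads(z, c)
    z -= lr * gz                        # continually adjust both vectors
    c -= lr * gc

loss_final = loss_grads(z, c)[0]
print(loss_final < loss0)               # alignment loss decreases
```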
arXiv Detail & Related papers (2023-08-08T13:28:34Z) - Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z) - Natural scene reconstruction from fMRI signals using generative latent diffusion [1.90365714903665]
We present a two-stage scene reconstruction framework called Brain-Diffuser.
In the first stage, we reconstruct images that capture low-level properties and the overall layout using a VDVAE (Very Deep Variational Autoencoder) model.
In the second stage, we use the image-to-image framework of a latent diffusion model conditioned on predicted multimodal (text and visual) features.
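The two-stage layout described above can be sketched as a minimal pipeline: stage 1 maps fMRI to a coarse image capturing low-level layout, and stage 2 refines it conditioned on predicted features. The linear "decoders" and blend-based "refinement" below are stand-ins, not the paper's VDVAE or latent-diffusion components.

```python
import numpy as np

rng = np.random.default_rng(2)
n_voxels, n_pixels, n_feat = 50, 16, 8

W_img = rng.standard_normal((n_pixels, n_voxels)) * 0.1   # assumed fMRI-to-image decoder
W_feat = rng.standard_normal((n_feat, n_voxels)) * 0.1    # assumed fMRI-to-feature decoder

def stage1_low_level(fmri):
    """Coarse reconstruction: low-level properties and layout (VDVAE stand-in)."""
    return W_img @ fmri

def stage2_refine(coarse, features, steps=5):
    """Iterative refinement conditioned on predicted multimodal features
    (image-to-image latent-diffusion stand-in): pull the image toward a
    feature-consistent target while preserving the coarse layout."""
    x = coarse.copy()
    target = np.tanh(x) + 0.1 * features.mean()  # toy conditioning signal
    for _ in range(steps):
        x = 0.7 * x + 0.3 * target               # partial refinement per step
    return x

fmri = rng.standard_normal(n_voxels)
coarse = stage1_low_level(fmri)
final = stage2_refine(coarse, W_feat @ fmri)
print(final.shape)
```

The design choice worth noting is that stage 2 starts from the stage-1 output rather than from noise, mirroring the image-to-image conditioning the summary describes.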
arXiv Detail & Related papers (2023-03-09T15:24:26Z) - Multi-institutional Collaborations for Improving Deep Learning-based
Magnetic Resonance Image Reconstruction Using Federated Learning [62.17532253489087]
Deep learning methods have been shown to produce superior performance on MR image reconstruction.
These methods require large amounts of data which is difficult to collect and share due to the high cost of acquisition and medical data privacy regulations.
We propose a federated learning (FL) based solution in which we take advantage of the MR data available at different institutions while preserving patients' privacy.
arXiv Detail & Related papers (2021-03-03T03:04:40Z)
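The federated-learning idea in the last entry can be illustrated with a minimal FedAvg-style sketch: each institution trains locally on its private data, and only model weights are shared and averaged, so raw images never leave the site. The data, linear model, and update rule are toy stand-ins, not the paper's reconstruction network.

```python
import numpy as np

rng = np.random.default_rng(3)

def local_update(w, X, y, lr=0.1, epochs=20):
    """A few steps of local least-squares gradient descent on private data."""
    for _ in range(epochs):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

# Three institutions, each holding data it never shares.
true_w = np.array([1.0, -2.0, 0.5])
sites = []
for _ in range(3):
    X = rng.standard_normal((40, 3))
    y = X @ true_w + 0.01 * rng.standard_normal(40)
    sites.append((X, y))

w_global = np.zeros(3)
for _ in range(10):                              # federated rounds
    local_ws = [local_update(w_global, X, y) for X, y in sites]
    w_global = np.mean(local_ws, axis=0)          # server averages weights only

print(np.round(w_global, 2))
```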
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.