Optimized two-stage AI-based Neural Decoding for Enhanced Visual Stimulus Reconstruction from fMRI Data
- URL: http://arxiv.org/abs/2412.13237v1
- Date: Tue, 17 Dec 2024 16:42:55 GMT
- Title: Optimized two-stage AI-based Neural Decoding for Enhanced Visual Stimulus Reconstruction from fMRI Data
- Authors: Lorenzo Veronese, Andrea Moglia, Luca Mainardi, Pietro Cerveri
- Abstract summary: This work proposes a non-linear deep network to improve the fMRI latent space representation, while also optimizing its dimensionality.
Experiments on the Natural Scenes Dataset showed that the proposed architecture improved the structural similarity of the reconstructed images by about 2% with respect to the state-of-the-art model.
A noise sensitivity analysis of the LDM showed that the first stage is fundamental for predicting stimuli with high structural similarity.
- Score: 2.0851013563386247
- Abstract: AI-based neural decoding reconstructs visual perception by leveraging generative models to map brain activity, measured through functional MRI (fMRI), into latent hierarchical representations. Traditionally, ridge linear models transform fMRI into a latent space, which is then decoded using latent diffusion models (LDM) via a pre-trained variational autoencoder (VAE). Due to the complexity and noisiness of fMRI data, newer approaches split the reconstruction into two sequential steps, the first providing a rough visual approximation and the second improving the stimulus prediction via an LDM endowed with CLIP embeddings. This work proposes a non-linear deep network to improve the fMRI latent space representation, while also optimizing its dimensionality. Experiments on the Natural Scenes Dataset showed that the proposed architecture improved the structural similarity of the reconstructed images by about 2% with respect to the state-of-the-art model based on the ridge linear transform. The semantics of the reconstructed images, measured by perceptual similarity, improved by about 4% with respect to the state of the art. A noise sensitivity analysis of the LDM showed that the first stage is fundamental for predicting stimuli with high structural similarity. Conversely, feeding the LDM a heavily noised first-stage stimulus affected the semantics of the prediction less, while the structural similarity between the ground truth and the predicted stimulus became very poor. The findings underscore the importance of leveraging non-linear relationships between the BOLD signal and the latent representation, and of two-stage generative AI, for optimizing the fidelity of visual stimuli reconstructed from noisy fMRI data.
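To make the two-stage idea concrete, here is a minimal PyTorch sketch, assuming a simple MLP as the non-linear fMRI-to-latent mapper and treating the pre-trained VAE decoder and the CLIP-conditioned LDM refinement as opaque callables; all module names, layer sizes, and the `reconstruct` helper are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FMRIToLatent(nn.Module):
    """Non-linear stand-in for the ridge regression that maps fMRI voxels
    to generative latents (dimensions are illustrative)."""
    def __init__(self, n_voxels=15000, vae_latent_dim=4 * 64 * 64, clip_dim=768):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_voxels, 4096), nn.GELU(), nn.Dropout(0.2),
            nn.Linear(4096, 2048), nn.GELU(),
        )
        # Stage-1 head: latent for the pre-trained VAE decoder (rough reconstruction)
        self.to_vae_latent = nn.Linear(2048, vae_latent_dim)
        # Stage-2 head: CLIP embedding used to condition the latent diffusion model
        self.to_clip = nn.Linear(2048, clip_dim)

    def forward(self, bold):
        h = self.backbone(bold)
        return self.to_vae_latent(h), self.to_clip(h)

def reconstruct(bold, mapper, vae_decoder, ldm_refine):
    """Two-stage reconstruction: VAE decoding for structure, LDM for semantics."""
    z_vae, clip_emb = mapper(bold)
    rough = vae_decoder(z_vae.view(-1, 4, 64, 64))  # stage 1: low-level layout
    return ldm_refine(rough, clip_emb)              # stage 2: CLIP-guided refinement
```

In a setup like this, the width of the backbone and of the latent heads is the dimensionality that the abstract describes as being optimized jointly with the non-linear transform.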
Related papers
- ContextMRI: Enhancing Compressed Sensing MRI through Metadata Conditioning [51.26601171361753]
We propose ContextMRI, a text-conditioned diffusion model for MRI that integrates granular metadata into the reconstruction process.
We show that increasing the fidelity of metadata, ranging from slice location and contrast to patient age, sex, and pathology, systematically boosts reconstruction performance.
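As a hedged illustration of what conditioning on such metadata can look like, the fields mentioned above could be serialized into a text prompt for a text-conditioned diffusion reconstructor; the field names and phrasing below are assumptions, not ContextMRI's actual prompt format.

```python
def metadata_prompt(meta: dict) -> str:
    """Serialize acquisition metadata into a text prompt for a text-conditioned
    MRI reconstruction model (field names and phrasing are assumptions)."""
    return (f"{meta['contrast']} {meta['anatomy']} MRI, slice {meta['slice_location']}, "
            f"{meta['age']}-year-old {meta['sex']} patient, pathology: {meta['pathology']}")

# Example:
# metadata_prompt({"contrast": "T2-weighted", "anatomy": "brain", "slice_location": 12,
#                  "age": 54, "sex": "female", "pathology": "none reported"})
```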
arXiv Detail & Related papers (2025-01-08T05:15:43Z) - Deep Cardiac MRI Reconstruction with ADMM [7.694990352622926]
We present a deep learning (DL)-based method for accelerated cine and multi-contrast reconstruction in the context of cardiac imaging.
Our method optimizes in both the image and k-space domains, allowing for high reconstruction fidelity.
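As a generic illustration of what optimizing in both domains usually involves (a single-coil data-consistency step, not the paper's ADMM formulation), the sketch below re-imposes the measured k-space samples on the current image estimate.

```python
import torch

def kspace_data_consistency(image, measured_kspace, mask):
    """Keep the network's k-space prediction where nothing was sampled and
    re-impose the measured samples where mask == 1 (single-coil, illustrative)."""
    k_pred = torch.fft.fft2(image)
    k_dc = mask * measured_kspace + (1 - mask) * k_pred
    return torch.fft.ifft2(k_dc).real
```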
arXiv Detail & Related papers (2023-10-10T13:46:11Z) - Fill the K-Space and Refine the Image: Prompting for Dynamic and Multi-Contrast MRI Reconstruction [31.404228406642194]
The key to dynamic or multi-contrast magnetic resonance imaging (MRI) reconstruction lies in exploring inter-frame or inter-contrast information.
We propose a two-stage MRI reconstruction pipeline to address these limitations.
Our proposed method significantly outperforms previous state-of-the-art accelerated MRI reconstruction methods.
arXiv Detail & Related papers (2023-09-25T02:51:00Z) - MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion [7.597218661195779]
We propose a two-stage image reconstruction model called MindDiffuser.
In Stage 1, the VQ-VAE latent representations and the CLIP text embeddings decoded from fMRI are put into Stable Diffusion.
In Stage 2, we utilize the CLIP visual features decoded from fMRI as supervisory information, and continually adjust the two feature vectors decoded in Stage 1 through backpropagation to align the structural information.
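A minimal sketch of this kind of backpropagation-based alignment is given below; `generate` and `clip_visual` are hypothetical stand-ins for a frozen Stable Diffusion decoder and a CLIP image encoder, not MindDiffuser's actual code.

```python
import torch
import torch.nn.functional as F

def align_features(z, clip_text, target_clip_visual, generate, clip_visual,
                   steps=50, lr=0.05):
    """Iteratively adjust the Stage-1 latents and text embeddings so that the
    CLIP visual features of the generated image match those decoded from fMRI
    (illustrative only)."""
    z = z.clone().requires_grad_(True)
    clip_text = clip_text.clone().requires_grad_(True)
    opt = torch.optim.Adam([z, clip_text], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        image = generate(z, clip_text)  # frozen generator, gradients flow to its inputs
        loss = 1 - F.cosine_similarity(clip_visual(image),
                                       target_clip_visual, dim=-1).mean()
        loss.backward()
        opt.step()
    return z.detach(), clip_text.detach()
```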
arXiv Detail & Related papers (2023-08-08T13:28:34Z) - Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding.
Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
arXiv Detail & Related papers (2023-03-26T14:14:58Z) - Natural scene reconstruction from fMRI signals using generative latent diffusion [1.90365714903665]
We present a two-stage scene reconstruction framework called "Brain-Diffuser".
In the first stage, we reconstruct images that capture low-level properties and overall layout using a VDVAE (Very Deep Variational Autoencoder) model.
In the second stage, we use the image-to-image framework of a latent diffusion model conditioned on predicted multimodal (text and visual) features.
arXiv Detail & Related papers (2023-03-09T15:24:26Z) - Optimization-Based Deep learning methods for Magnetic Resonance Imaging Reconstruction and Synthesis [0.0]
This dissertation aims to provide advanced nonsmooth variational models for Magnetic Resonance Imaging (MRI) reconstruction, efficient learnable image reconstruction algorithms, and deep learning methods for MRI reconstruction and synthesis.
The first part introduces a novel optimization-based deep neural network whose architecture is inspired by proximal gradient descent for a variational model.
The second part is a substantial extension of the preliminary work in the first part, solving the calibration-free fast pMRI reconstruction problem in a discrete-time optimal control framework.
The third part aims at developing a generalizable MRI reconstruction method in a meta-learning framework.
arXiv Detail & Related papers (2023-03-02T18:59:44Z) - Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction [68.80715727288514]
In this paper, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction.
We show how to unfold the iterative MGDUN algorithm into a novel model-guided deep unfolding network by taking the MRI observation matrix into account.
arXiv Detail & Related papers (2022-09-15T03:58:30Z) - PUERT: Probabilistic Under-sampling and Explicable Reconstruction Network for CS-MRI [47.24613772568027]
Compressed Sensing MRI aims at reconstructing de-aliased images from sub-Nyquist sampled k-space data to accelerate MR imaging.
We propose a novel end-to-end Probabilistic Under-sampling and Explicable Reconstruction neTwork, dubbed PUERT, to jointly optimize the sampling pattern and the reconstruction network.
Experiments on two widely used MRI datasets demonstrate that our proposed PUERT achieves state-of-the-art results in terms of both quantitative metrics and visual quality.
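The joint optimization of sampling pattern and reconstructor can be pictured with the short sketch below: each k-space line gets a learnable Bernoulli probability, binarized with a straight-through estimator so gradients reach the sampling logits. This is a generic sketch of the idea, not PUERT's exact parameterization.

```python
import torch
import torch.nn as nn

class LearnableSamplingMask(nn.Module):
    """Probabilistic under-sampling pattern trained jointly with the
    reconstruction network (generic sketch, not PUERT's parameterization)."""
    def __init__(self, n_lines=256):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_lines))

    def forward(self):
        p = torch.sigmoid(self.logits)
        hard = torch.bernoulli(p)
        # Straight-through: forward uses the binary mask, backward uses p.
        return hard + p - p.detach()
```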
arXiv Detail & Related papers (2022-04-24T04:23:57Z) - A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction [50.1787181309337]
We propose a convolutional long short-term memory (Conv-LSTM) based recurrent neural network (RNN), or ConvLR, to reconstruct interventional images with golden-angle radial sampling.
The proposed algorithm has the potential to achieve real-time i-MRI for DBS and can be used for general purpose MR-guided intervention.
arXiv Detail & Related papers (2022-03-28T14:03:45Z) - Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties.
We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate".
We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z)