MinD-3D++: Advancing fMRI-Based 3D Reconstruction with High-Quality Textured Mesh Generation and a Comprehensive Dataset
- URL: http://arxiv.org/abs/2409.11315v2
- Date: Fri, 10 Jan 2025 19:36:30 GMT
- Title: MinD-3D++: Advancing fMRI-Based 3D Reconstruction with High-Quality Textured Mesh Generation and a Comprehensive Dataset
- Authors: Jianxiong Gao, Yanwei Fu, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng
- Abstract summary: Reconstructing 3D visuals from functional Magnetic Resonance Imaging (fMRI) data is of significant interest to cognitive neuroscience and computer vision.
We present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4,768 3D objects.
We propose MinD-3D++, a novel framework for decoding textured 3D visual information from fMRI signals.
- Score: 50.534007259536715
- Abstract: Reconstructing 3D visuals from functional Magnetic Resonance Imaging (fMRI) data, introduced as Recon3DMind, is of significant interest to both cognitive neuroscience and computer vision. To advance this task, we present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4,768 3D objects. The dataset consists of two components: fMRI-Shape, previously introduced and available at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Shape, and fMRI-Objaverse, proposed in this paper and available at https://huggingface.co/datasets/Fudan-fMRI/fMRI-Objaverse. fMRI-Objaverse includes data from 5 subjects, 4 of whom are also part of the core set in fMRI-Shape. Each subject views 3,142 3D objects across 117 categories, all accompanied by text captions. This significantly enhances the diversity and potential applications of the dataset. Moreover, we propose MinD-3D++, a novel framework for decoding textured 3D visual information from fMRI signals. The framework evaluates the feasibility of not only reconstructing 3D objects from the human mind but also generating, for the first time, 3D textured meshes with detailed textures from fMRI data. We establish new benchmarks by designing metrics at the semantic, structural, and textured levels to evaluate model performance. Furthermore, we assess the model's effectiveness in out-of-distribution settings and analyze the attribution of the proposed 3D paired fMRI dataset across visual regions of interest (ROIs) in fMRI signals. Our experiments demonstrate that MinD-3D++ not only reconstructs 3D objects with high semantic and spatial accuracy but also provides deeper insights into how the human brain processes 3D visual information. Project page: https://jianxgao.github.io/MinD-3D.
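Both dataset components are hosted on Hugging Face at the URLs above. Below is a minimal sketch for fetching them locally, assuming only the standard `huggingface_hub` client; the internal file layout of each repository is not described in the abstract, so any further parsing would be dataset-specific.

```python
# Minimal sketch: download the two fMRI-3D components from Hugging Face.
# Only the repository IDs come from the abstract; everything else is
# generic `huggingface_hub` usage.
from huggingface_hub import snapshot_download

shape_dir = snapshot_download(
    repo_id="Fudan-fMRI/fMRI-Shape",
    repo_type="dataset",
)
objaverse_dir = snapshot_download(
    repo_id="Fudan-fMRI/fMRI-Objaverse",
    repo_type="dataset",
)
print(shape_dir, objaverse_dir)  # local cache paths of the snapshots
```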
Related papers
- MRI Reconstruction with Regularized 3D Diffusion Model (R3DM) [2.842800539489865]
We propose a 3D MRI reconstruction method that leverages a regularized 3D diffusion model combined with an optimization method.
By incorporating diffusion-based priors, our method improves image quality, reduces noise, and enhances the overall fidelity of 3D MRI reconstructions.
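The summary pairs a learned diffusion prior with an optimization loop. A schematic of one common way such coupling is realized, alternating a data-fidelity gradient step with a denoising step; `forward_op` and `denoiser` are hypothetical stand-ins, not the paper's actual components:

```python
import torch

def reconstruct(y, forward_op, denoiser, steps=100, lr=0.1):
    """Schematic diffusion-regularized 3D MRI reconstruction.

    y          : acquired measurement tensor (e.g., undersampled k-space)
    forward_op : differentiable imaging operator A (hypothetical stand-in)
    denoiser   : pretrained diffusion denoiser acting as a 3D image prior
    """
    # Initialize from the adjoint of the measurements.
    x = forward_op.adjoint(y).clone().requires_grad_(True)
    opt = torch.optim.SGD([x], lr=lr)
    for t in reversed(range(steps)):
        opt.zero_grad()
        # Data-fidelity step: keep the estimate consistent with measurements.
        loss = (forward_op(x) - y).pow(2).sum()
        loss.backward()
        opt.step()
        # Prior step: pull the estimate toward the diffusion model's
        # learned distribution of clean 3D volumes.
        with torch.no_grad():
            x.copy_(denoiser(x, t))
    return x.detach()
```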
arXiv Detail & Related papers (2024-12-25T00:55:05Z) - Feat2GS: Probing Visual Foundation Models with Gaussian Splatting [53.86049874165888]
Visual foundation models (VFMs) are trained on extensive datasets but are often limited to 2D images.
We introduce Feat2GS, which reads out 3D Gaussian attributes from VFM features extracted from unposed images.
We conduct extensive experiments to probe the 3D awareness of several VFMs, and investigate the ingredients that lead to a 3D aware VFM.
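As a toy illustration of what "reading out 3D Gaussian attributes from VFM features" can look like, here is a minimal per-pixel linear head; the feature dimension and the 14-parameter Gaussian encoding are assumptions for illustration, not Feat2GS's actual design:

```python
import torch
import torch.nn as nn

class GaussianReadout(nn.Module):
    """Toy readout from per-pixel VFM features to 3D Gaussian attributes."""

    def __init__(self, feat_dim=768):
        super().__init__()
        # 3 center offsets + 3 log-scales + 4 rotation quaternion
        # + 1 opacity + 3 RGB = 14 attributes per Gaussian (assumed layout).
        self.head = nn.Linear(feat_dim, 14)

    def forward(self, feats):  # feats: (N_pixels, feat_dim)
        out = self.head(feats)
        xyz     = out[:, 0:3]                                      # centers
        scales  = out[:, 3:6].exp()                                # positive scales
        rot     = nn.functional.normalize(out[:, 6:10], dim=-1)   # unit quaternion
        opacity = out[:, 10:11].sigmoid()
        rgb     = out[:, 11:14].sigmoid()
        return xyz, scales, rot, opacity, rgb
```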
arXiv Detail & Related papers (2024-12-12T18:59:28Z) - Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes [65.22070581594426]
"Implicit-Zoo" is a large-scale dataset requiring thousands of GPU training days to facilitate research and development in this field.
We showcase two immediate benefits as it enables to: (1) learn token locations for transformer models; (2) directly regress 3D cameras poses of 2D images with respect to NeRF models.
This in turn leads to an improved performance in all three task of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
arXiv Detail & Related papers (2024-06-25T10:20:44Z) - Brain3D: Generating 3D Objects from fMRI [76.41771117405973]
We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject.
We show that our model captures the distinct functionalities of each region of the human vision system.
Preliminary evaluations indicate that Brain3D can successfully identify the disordered brain regions in simulated scenarios.
arXiv Detail & Related papers (2024-05-24T06:06:11Z) - VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
Monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics.
In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
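A standard way to turn a single-view image into a LiDAR-style point cloud is the pseudo-LiDAR construction: predict per-pixel depth, then unproject it through the camera intrinsics. A generic sketch of that step, not VFMM3D's exact pipeline; the depth map is assumed to come from a VFM-based predictor:

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Unproject a per-pixel depth map into a camera-frame point cloud.

    depth : (H, W) array of metric depths (e.g., from a VFM depth model)
    fx, fy, cx, cy : pinhole camera intrinsics
    Returns an (H*W, 3) array of 3D points.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    # Invert the pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```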
arXiv Detail & Related papers (2024-04-15T03:12:12Z) - MinD-3D: Reconstruct High-quality 3D objects in Human Brain [50.534007259536715]
Recon3DMind is an innovative task aimed at reconstructing 3D visuals from Functional Magnetic Resonance Imaging (fMRI) signals.
We present the fMRI-Shape dataset, which includes data from 14 participants and features 360-degree videos of 3D objects.
We propose MinD-3D, a novel and effective three-stage framework specifically designed to decode the brain's 3D visual information from fMRI signals.
arXiv Detail & Related papers (2023-12-12T18:21:36Z) - MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection [31.58403386994297]
We propose MonoNeRD, a novel detection framework that can infer dense 3D geometry and occupancy.
Specifically, we model scenes with Signed Distance Functions (SDF), facilitating the production of dense 3D representations.
To the best of our knowledge, this work is the first to introduce volume rendering for monocular 3D detection (M3D), and it demonstrates the potential of implicit reconstruction for image-based 3D perception.
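For the volume-rendering step, here is a minimal sketch of rendering one ray from SDF samples. The SDF-to-density conversion uses a VolSDF-style Laplace form as an assumption; the summary does not specify the exact conversion MonoNeRD uses:

```python
import torch

def laplace_cdf(x, beta):
    # CDF of a zero-mean Laplace distribution with scale `beta`.
    z = x / beta
    return torch.where(z <= 0, 0.5 * torch.exp(z), 1.0 - 0.5 * torch.exp(-z))

def render_ray_from_sdf(sdf, rgb, deltas, alpha=10.0, beta=0.1):
    """Volume-render one ray from SDF samples (NeRF-style compositing).

    sdf    : (S,) signed distances at S samples along the ray
    rgb    : (S, 3) per-sample colors
    deltas : (S,) spacing between consecutive samples
    """
    # Signed distance -> volume density, peaking near the zero level set.
    density = alpha * laplace_cdf(-sdf, beta)
    # Standard alpha compositing: per-sample opacity and transmittance.
    a = 1.0 - torch.exp(-density * deltas)
    T = torch.cumprod(torch.cat([torch.ones(1), 1.0 - a[:-1]]), dim=0)
    weights = T * a
    return (weights[:, None] * rgb).sum(dim=0)  # rendered RGB for this ray
```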
arXiv Detail & Related papers (2023-08-18T09:39:52Z)