Related papers: Neural 3D decoding for human vision diagnosis

URL: http://arxiv.org/abs/2405.15239v2
Date: Sun, 21 Jul 2024 14:28:44 GMT
Title: Neural 3D decoding for human vision diagnosis
Authors: Li Zhang, Yuankun Yang, Ziyang Xie, Zhiyuan Yuan, Jianfeng Feng, Xiatian Zhu, Yu-Gang Jiang
Abstract summary: We show how AI can go beyond the current state of the art by advancing from 2D visuals to visually plausible and functionally more comprehensive 3D visuals decoded from brain signals. We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject who was presented with a 2D image, and yields as output the corresponding 3D object visuals.
Score: 76.41771117405973
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Understanding the hidden mechanisms behind human visual perception is a fundamental question in neuroscience. To that end, investigating neural responses to human mental activity, for example through functional Magnetic Resonance Imaging (fMRI), has been a significant research vehicle. However, analyzing fMRI signals is challenging, costly, and daunting, and it demands professional training. Despite remarkable progress in artificial intelligence (AI) based fMRI analysis, existing solutions are limited and far from being biologically meaningful and practically useful. In this context, we demonstrate how AI can go beyond the current state of the art by advancing from 2D visuals to visually plausible and functionally more comprehensive 3D visuals decoded from brain signals, enabling more sophisticated automatic modeling of fMRI data. As a key innovation, we reformulate the task of analyzing fMRI data as a conditional 3D object generation problem. We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject who was presented with a 2D image, and yields as output the corresponding 3D object visuals. Importantly, we show that our AI agent captures the distinct functionalities of each region of the human visual system as well as their intricate interplay, aligning remarkably with the established discoveries of neuroscience. Non-expert diagnosis indicates that Brain3D can successfully identify the disordered brain regions in simulated scenarios, such as V1, V2, V3, V4, and the medial temporal lobe (MTL) within the human visual system. We also present results in a cross-modal 3D visual generation setting, showcasing the perceptual quality of our 3D generation.

Related papers

Voxel-Level Brain States Prediction Using Swin Transformer [65.9194533414066]
We propose a novel architecture which employs a 4D Shifted Window (Swin) Transformer as encoder to efficiently learn spatio-temporal information and a convolutional decoder to enable brain state prediction at the same spatial and temporal resolution as the input fMRI data. Our model has shown high accuracy when predicting 7.2s resting-state brain activities based on the prior 23.04s fMRI time series. This shows promising evidence that the spatio-temporal organization of the human brain can be learned by a Swin Transformer model, at high resolution, which provides a potential for reducing fMRI scan time and the development of brain-computer interfaces.
arXiv Detail & Related papers (2025-06-13T04:14:38Z)
Neuro-3D: Towards 3D Visual Decoding from EEG Signals [49.502364730056044]
We introduce a new neuroscience task: decoding 3D visual perception from EEG signals. We first present EEG-3D, a dataset featuring multimodal analysis data and EEG recordings from 12 subjects viewing 72 categories of 3D objects rendered in both videos and images. We propose Neuro-3D, a 3D visual decoding framework based on EEG signals.
arXiv Detail & Related papers (2024-11-19T05:52:17Z)
fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction [50.534007259536715]
We present the fMRI-3D dataset, which includes data from 15 participants and showcases a total of 4768 3D objects. We propose MinD-3D, a novel framework designed to decode 3D visual information from fMRI signals.
arXiv Detail & Related papers (2024-09-17T16:13:59Z)
BrainODE: Dynamic Brain Signal Analysis via Graph-Aided Neural Ordinary Differential Equations [67.79256149583108]
We propose a novel model called BrainODE to achieve continuous modeling of dynamic brain signals. By learning latent initial values and neural ODE functions from irregular time series, BrainODE effectively reconstructs brain signals at any time point.
arXiv Detail & Related papers (2024-04-30T10:53:30Z)
MinD-3D: Reconstruct High-quality 3D objects in Human Brain [50.534007259536715]
Recon3DMind is an innovative task aimed at reconstructing 3D visuals from Functional Magnetic Resonance Imaging (fMRI) signals. We present the fMRI-Shape dataset, which includes data from 14 participants and features 360-degree videos of 3D objects. We propose MinD-3D, a novel and effective three-stage framework specifically designed to decode the brain's 3D visual information from fMRI signals.
arXiv Detail & Related papers (2023-12-12T18:21:36Z)
Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI [12.203617776046169]
We introduce a novel framework named Brainformer to analyze fMRI patterns in the human perception system. This work introduces a prospective approach to transferring knowledge from human perception to neural networks.
arXiv Detail & Related papers (2023-11-30T22:39:23Z)
Contrast, Attend and Diffuse to Decode High-Resolution Images from Brain Activities [31.448924808940284]
We introduce a two-phase fMRI representation learning framework. The first phase pre-trains an fMRI feature learner with a proposed Double-contrastive Mask Auto-encoder to learn denoised representations. The second phase tunes the feature learner to attend to neural activation patterns most informative for visual reconstruction with guidance from an image auto-encoder.
arXiv Detail & Related papers (2023-05-26T19:16:23Z)
3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations. A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z)
Medical Transformer: Universal Brain Encoder for 3D MRI Analysis [1.6287500717172143]
Existing 3D-based methods have transferred pre-trained models to downstream tasks. They demand a massive number of parameters to train the model for 3D medical imaging. We propose a novel transfer learning framework, called Medical Transformer, that effectively models 3D volumetric images in the form of a sequence of 2D image slices.
arXiv Detail & Related papers (2021-04-28T08:34:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
