Related papers: Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI

Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI

URL: http://arxiv.org/abs/2312.00236v3
Date: Wed, 20 Nov 2024 19:11:42 GMT
Title: Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI
Authors: Xuan-Bac Nguyen, Xin Li, Pawan Sinha, Samee U. Khan, Khoa Luu,
Abstract summary: We introduce a novel framework named Brainformer to analyze fMRI patterns in the human perception system. This work introduces a prospective approach to transferring knowledge from human perception to neural networks.
Score: 12.203617776046169
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Human perception plays a vital role in forming beliefs and understanding reality. A deeper understanding of brain functionality will lead to the development of novel deep neural networks. In this work, we introduce a novel framework named Brainformer, a straightforward yet effective Transformer-based framework, to analyze Functional Magnetic Resonance Imaging (fMRI) patterns in the human perception system from a machine-learning perspective. Specifically, we present the Multi-scale fMRI Transformer to explore brain activity patterns through fMRI signals. This architecture includes a simple yet efficient module for high-dimensional fMRI signal encoding and incorporates a novel embedding technique called 3D Voxels Embedding. Secondly, drawing inspiration from the functionality of the brain's Region of Interest, we introduce a novel loss function called Brain fMRI Guidance Loss. This loss function mimics brain activity patterns from these regions in the deep neural network using fMRI data. This work introduces a prospective approach to transferring knowledge from human perception to neural networks. Our experiments demonstrate that leveraging fMRI information allows the machine vision model to achieve results comparable to State-of-the-Art methods in various image recognition tasks.

Related papers

BrainFLORA: Uncovering Brain Concept Representation via Multimodal Neural Embeddings [10.966252877363512]
We introduce BrainFLORA, a unified framework for integrating cross-modal neuroimaging data to construct a shared neural representation.<n>Our approach leverages multimodal large language models (MLLMs) augmented with modality-specific adapters and task decoders, achieving state-of-the-art performance in joint-subject visual retrieval.<n>BrainFLORA offers novel implications for cognitive neuroscience and brain-computer interfaces (BCIs)
arXiv Detail & Related papers (2025-07-13T18:56:17Z)
Voxel-Level Brain States Prediction Using Swin Transformer [65.9194533414066]
We propose a novel architecture which employs a 4D Shifted Window (Swin) Transformer as encoder to efficiently learn-temporal information and a convolutional decoder to enable brain state prediction at the same spatial and temporal resolution as the input fMRI data.<n>Our model has shown high accuracy when predicting 7.2s resting-state brain activities based on the prior 23.04s fMRI time series.<n>This shows promising evidence that thetemporal organization of the human brain can be learned by a Swin Transformer model, at high resolution, which provides a potential for reducing fMRI scan time and the development of brain-computer interfaces
arXiv Detail & Related papers (2025-06-13T04:14:38Z)
A Survey on fMRI-based Brain Decoding for Reconstructing Multimodal Stimuli [26.07986165893441]
Decoding brain signals to reconstruct stimuli drives progress in AI, disease treatment, and brain-computer interfaces. Recent advancements in neuroimaging and image generation models have significantly improved fMRI-based decoding. This survey systematically reviews recent progress in fMRI-based brain decoding, focusing on stimulus reconstruction from passive brain signals.
arXiv Detail & Related papers (2025-03-20T09:23:07Z)
MindAligner: Explicit Brain Functional Alignment for Cross-Subject Visual Decoding from Limited fMRI Data [64.92867794764247]
MindAligner is a framework for cross-subject brain decoding from limited fMRI data. Brain Transfer Matrix (BTM) projects the brain signals of an arbitrary new subject to one of the known subjects. Brain Functional Alignment module is proposed to perform soft cross-subject brain alignment under different visual stimuli.
arXiv Detail & Related papers (2025-02-07T16:01:59Z)
Towards Neural Foundation Models for Vision: Aligning EEG, MEG, and fMRI Representations for Decoding, Encoding, and Modality Conversion [0.11249583407496218]
This paper presents a novel approach towards creating a foundational model for aligning neural data and visual stimuli across multimodal representationsof brain activity by leveraging contrastive learning. We used electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI) data. Our framework's capabilities are demonstrated through three key experiments: decoding visual information from neural data, encoding images into neural representations, and converting between neural modalities.
arXiv Detail & Related papers (2024-11-14T12:27:27Z)
Teaching CORnet Human fMRI Representations for Enhanced Model-Brain Alignment [2.035627332992055]
Functional magnetic resonance imaging (fMRI) as a widely used technique in cognitive neuroscience can record neural activation in the human visual cortex during the process of visual perception. This study proposed ReAlnet-fMRI, a model based on the SOTA vision model CORnet but optimized using human fMRI data through a multi-layer encoding-based alignment framework. The fMRI-optimized ReAlnet-fMRI exhibited higher similarity to the human brain than both CORnet and the control model in within-and across-subject as well as within- and across-modality model-brain (fMRI and EEG
arXiv Detail & Related papers (2024-07-15T03:31:42Z)
MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce a novel semantic alignment method of multi-subject fMRI signals using so-called MindFormer. This model is specifically designed to generate fMRI-conditioned feature vectors that can be used for conditioning Stable Diffusion model for fMRI- to-image generation or large language model (LLM) for fMRI-to-text generation. Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z)
Brain3D: Generating 3D Objects from fMRI [76.41771117405973]
We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject. We show that our model captures the distinct functionalities of each region of human vision system. Preliminary evaluations indicate that Brain3D can successfully identify the disordered brain regions in simulated scenarios.
arXiv Detail & Related papers (2024-05-24T06:06:11Z)
MindBridge: A Cross-Subject Brain Decoding Framework [60.58552697067837]
Brain decoding aims to reconstruct stimuli from acquired brain signals. Currently, brain decoding is confined to a per-subject-per-model paradigm. We present MindBridge, that achieves cross-subject brain decoding by employing only one model.
arXiv Detail & Related papers (2024-04-11T15:46:42Z)
fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding [54.17776744076334]
We propose fMRI-PTE, an innovative auto-encoder approach for fMRI pre-training. Our approach involves transforming fMRI signals into unified 2D representations, ensuring consistency in dimensions and preserving brain activity patterns. Our contributions encompass introducing fMRI-PTE, innovative data transformation, efficient training, a novel learning strategy, and the universal applicability of our approach.
arXiv Detail & Related papers (2023-11-01T07:24:22Z)
Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex [12.1427193917406]
We propose an artificial neural network dubbed VISION to mimic the human brain and show how it can foster neuroscientific inquiries. VISION successfully predicts human hemodynamic responses as fMRI voxel values to visual inputs with an accuracy exceeding state-of-the-art performance by 45%.
arXiv Detail & Related papers (2023-09-26T15:38:26Z)
Contrast, Attend and Diffuse to Decode High-Resolution Images from Brain Activities [31.448924808940284]
We introduce a two-phase fMRI representation learning framework. The first phase pre-trains an fMRI feature learner with a proposed Double-contrastive Mask Auto-encoder to learn denoised representations. The second phase tunes the feature learner to attend to neural activation patterns most informative for visual reconstruction with guidance from an image auto-encoder.
arXiv Detail & Related papers (2023-05-26T19:16:23Z)
Brain Captioning: Decoding human brain activity into images and text [1.5486926490986461]
We present an innovative method for decoding brain activity into meaningful images and captions. Our approach takes advantage of cutting-edge image captioning models and incorporates a unique image reconstruction pipeline. We evaluate our methods using quantitative metrics for both generated captions and images.
arXiv Detail & Related papers (2023-05-19T09:57:19Z)
Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding. Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
arXiv Detail & Related papers (2023-03-26T14:14:58Z)
Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder. We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.