DreamCatcher: Revealing the Language of the Brain with fMRI using GPT Embedding
- URL: http://arxiv.org/abs/2306.10082v1
- Date: Fri, 16 Jun 2023 07:55:20 GMT
- Title: DreamCatcher: Revealing the Language of the Brain with fMRI using GPT Embedding
- Authors: Subhrasankar Chatterjee and Debasis Samanta
- Abstract summary: We propose fMRI captioning, where captions are generated based on fMRI data to gain insight into visual perception.
DreamCatcher consists of the Representation Space Encoder (RSE) and the RevEmbedding Decoder, which transform fMRI vectors into a latent space and generate captions, respectively.
fMRI-based captioning has diverse applications, including understanding neural mechanisms, Human-Computer Interaction, and enhancing learning and training processes.
- Score: 6.497816402045099
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The human brain possesses remarkable abilities in visual processing,
including image recognition and scene summarization. Efforts have been made to
understand the cognitive capacities of the visual brain, but a comprehensive
understanding of the underlying mechanisms remains elusive.
Advancements in brain decoding techniques have led to sophisticated approaches
like fMRI-to-Image reconstruction, which has implications for cognitive
neuroscience and medical imaging. However, challenges persist in fMRI-to-image
reconstruction, such as incorporating global and contextual information. In
this article, we propose fMRI captioning, where captions are
generated based on fMRI data to gain insight into the neural correlates of
visual perception. This research presents DreamCatcher, a novel framework for
fMRI captioning. DreamCatcher consists of the Representation Space Encoder
(RSE) and the RevEmbedding Decoder, which transform fMRI vectors into a latent
space and generate captions, respectively. We evaluated the framework through
visualization, dataset training, and testing on subjects, demonstrating strong
performance. fMRI-based captioning has diverse applications, including
understanding neural mechanisms, Human-Computer Interaction, and enhancing
learning and training processes.
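The abstract describes a two-stage pipeline: an encoder (RSE) maps an fMRI vector into a latent space, and a decoder (RevEmbedding) maps that latent into a GPT embedding space from which a caption is read out. The sketch below is purely schematic, with assumptions throughout: the toy dimensions, the linear maps, and the nearest-word readout are illustrative stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: real fMRI vectors have thousands of voxels.
N_VOXELS, D_LATENT, D_EMB = 512, 64, 32

# Hypothetical stand-ins for the paper's two components.
W_enc = rng.standard_normal((N_VOXELS, D_LATENT)) / np.sqrt(N_VOXELS)
W_dec = rng.standard_normal((D_LATENT, D_EMB)) / np.sqrt(D_LATENT)

# A toy "GPT embedding" table: one vector per vocabulary word.
vocab = ["a", "person", "rides", "horse", "on", "the", "beach", "sand"]
emb_table = rng.standard_normal((len(vocab), D_EMB))

def rse_encode(fmri_vec):
    """Representation Space Encoder: fMRI vector -> latent vector (schematic)."""
    return np.tanh(fmri_vec @ W_enc)

def revembedding_decode(latent, n_tokens=5):
    """RevEmbedding Decoder: latent -> point in the embedding space,
    matched to the nearest vocabulary embeddings (schematic readout)."""
    point = latent @ W_dec
    sims = emb_table @ point               # similarity to each word embedding
    order = np.argsort(-sims)[:n_tokens]   # top-k nearest words
    return [vocab[i] for i in order]

fmri_vec = rng.standard_normal(N_VOXELS)   # fake fMRI sample
latent = rse_encode(fmri_vec)
caption = revembedding_decode(latent)
print(caption)
```

In the actual framework the decoder would drive a trained language model rather than a nearest-neighbor lookup; the sketch only shows the shape of the data flow.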
Related papers
- BrainChat: Decoding Semantic Information from fMRI using Vision-language Pretrained Models [0.0]
This paper proposes BrainChat, a generative framework aimed at rapidly accomplishing semantic information decoding tasks from brain activity.
BrainChat implements fMRI question answering and fMRI captioning.
BrainChat is highly flexible and can achieve high performance without image data, making it better suited for real-world scenarios with limited data.
arXiv Detail & Related papers (2024-06-10T12:06:15Z)
- MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce MindFormer, a novel semantic alignment method for multi-subject fMRI signals.
This model is specifically designed to generate fMRI-conditioned feature vectors that can be used to condition a Stable Diffusion model for fMRI-to-image generation or a large language model (LLM) for fMRI-to-text generation.
Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z) - MindBridge: A Cross-Subject Brain Decoding Framework [60.58552697067837]
Brain decoding aims to reconstruct stimuli from acquired brain signals.
Currently, brain decoding is confined to a per-subject-per-model paradigm.
We present MindBridge, which achieves cross-subject brain decoding using only one model.
arXiv Detail & Related papers (2024-04-11T15:46:42Z) - Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI [12.203617776046169]
We introduce a novel framework named Brainformer to analyze fMRI patterns in the human perception system.
This work introduces a prospective approach to transfer knowledge from human perception to neural networks.
arXiv Detail & Related papers (2023-11-30T22:39:23Z) - fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for
Multi-Subject Brain Activity Decoding [54.17776744076334]
We propose fMRI-PTE, an innovative auto-encoder approach for fMRI pre-training.
Our approach involves transforming fMRI signals into unified 2D representations, ensuring consistency in dimensions and preserving brain activity patterns.
Our contributions encompass introducing fMRI-PTE, innovative data transformation, efficient training, a novel learning strategy, and the universal applicability of our approach.
arXiv Detail & Related papers (2023-11-01T07:24:22Z) - UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion
Model from Human Brain Activity [2.666777614876322]
We propose UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity.
We transform fMRI voxels into text and image latents that carry low-level information, generating realistic captions and images.
UniBrain outperforms current methods both qualitatively and quantitatively in terms of image reconstruction and reports image captioning results for the first time on the Natural Scenes dataset.
arXiv Detail & Related papers (2023-08-14T19:49:29Z) - Brain Captioning: Decoding human brain activity into images and text [1.5486926490986461]
We present an innovative method for decoding brain activity into meaningful images and captions.
Our approach takes advantage of cutting-edge image captioning models and incorporates a unique image reconstruction pipeline.
We evaluate our methods using quantitative metrics for both generated captions and images.
arXiv Detail & Related papers (2023-05-19T09:57:19Z) - Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z) - Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding.
Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
arXiv Detail & Related papers (2023-03-26T14:14:58Z) - BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP
for Generic Natural Visual Stimulus Decoding [51.911473457195555]
BrainCLIP is a task-agnostic fMRI-based brain decoding model.
It bridges the modality gap between brain activity, image, and text.
BrainCLIP can reconstruct visual stimuli with high semantic fidelity.
arXiv Detail & Related papers (2023-02-25T03:28:54Z) - Mind Reader: Reconstructing complex images from brain activities [16.78619734818198]
We focus on reconstructing the complex image stimuli from fMRI (functional magnetic resonance imaging) signals.
Unlike previous works that reconstruct images with single objects or simple shapes, our work aims to reconstruct image stimuli rich in semantics.
We find that incorporating an additional text modality is beneficial for the reconstruction problem compared to directly translating brain signals to images.
arXiv Detail & Related papers (2022-09-30T06:32:46Z)
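Several of the papers above (e.g. BrainCLIP and the latent embedding alignment framework) bridge the modality gap by aligning fMRI embeddings with image or text embeddings contrastively. A minimal NumPy sketch of a CLIP-style symmetric InfoNCE objective is shown below; the dimensions, temperature, and random data standing in for real embeddings are all assumptions for illustration, not any one paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def l2_normalize(x):
    """Project each row onto the unit sphere, as CLIP-style models do."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def clip_style_loss(brain_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired (fMRI, image)
    embeddings; matching pairs sit on the diagonal of the logit matrix."""
    b = l2_normalize(brain_emb)
    v = l2_normalize(image_emb)
    logits = (b @ v.T) / temperature              # pairwise similarities
    labels = np.arange(len(logits))
    # Cross-entropy in both directions (brain->image and image->brain).
    lp_bi = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    lp_ib = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return -(lp_bi[labels, labels].mean() + lp_ib[labels, labels].mean()) / 2

batch, dim = 8, 16
image_emb = rng.standard_normal((batch, dim))
# Well-aligned brain embeddings (image embedding plus small noise)
# versus completely random ones.
aligned = clip_style_loss(image_emb + 0.01 * rng.standard_normal((batch, dim)),
                          image_emb)
random = clip_style_loss(rng.standard_normal((batch, dim)), image_emb)
print(aligned, random)
```

A trained brain encoder drives the loss toward the aligned case, so that fMRI embeddings land near the embeddings of the images (or captions) that evoked them.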
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.