Lite-Mind: Towards Efficient and Robust Brain Representation Network
- URL: http://arxiv.org/abs/2312.03781v3
- Date: Fri, 19 Apr 2024 05:45:25 GMT
- Title: Lite-Mind: Towards Efficient and Robust Brain Representation Network
- Authors: Zixuan Gong, Qi Zhang, Guangyin Bao, Lei Zhu, Yu Zhang, Ke Liu, Liang Hu, Duoqian Miao
- Abstract summary: Lite-Mind is a lightweight, efficient, and robust brain representation learning paradigm based on the Discrete Fourier Transform (DFT)
We show Lite-Mind achieves an impressive 94.6% fMRI-to-image retrieval accuracy on the NSD dataset for Subject 1, with 98.7% fewer parameters than MindEye.
Lite-Mind also transfers to smaller fMRI datasets and establishes a new state-of-the-art for zero-shot classification on the GOD dataset.
- Score: 23.13231031281597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The limited data availability and the low signal-to-noise ratio of fMRI signals lead to the challenging task of fMRI-to-image retrieval. State-of-the-art MindEye remarkably improves fMRI-to-image retrieval performance by leveraging a large model, i.e., a 996M MLP Backbone per subject, to align fMRI embeddings to the final hidden layer of CLIP's Vision Transformer (ViT). However, significant individual variations exist among subjects, even under identical experimental setups, mandating the training of large subject-specific models. The substantial parameters pose significant challenges in deploying fMRI decoding on practical devices. To this end, we propose Lite-Mind, a lightweight, efficient, and robust brain representation learning paradigm based on the Discrete Fourier Transform (DFT), which efficiently aligns fMRI voxels to fine-grained information of CLIP. We elaborately design a DFT backbone with Spectrum Compression and Frequency Projector modules to learn informative and robust voxel embeddings. Our experiments demonstrate that Lite-Mind achieves an impressive 94.6% fMRI-to-image retrieval accuracy on the NSD dataset for Subject 1, with 98.7% fewer parameters than MindEye. Lite-Mind also transfers to smaller fMRI datasets and establishes a new state-of-the-art for zero-shot classification on the GOD dataset.
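The abstract names the mechanism (a DFT backbone with Spectrum Compression and Frequency Projector modules that maps voxels into CLIP's embedding space for retrieval) without giving code. The PyTorch sketch below illustrates the general idea only: the class name, layer sizes, default voxel count, and the simple low-pass form of spectrum compression are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LiteMindSketch(nn.Module):
    """Hypothetical DFT backbone: rFFT -> spectrum compression -> frequency projector."""

    def __init__(self, n_voxels: int = 15724, n_kept_freqs: int = 1024, clip_dim: int = 768):
        super().__init__()
        self.n_kept_freqs = n_kept_freqs
        # "Spectrum Compression" here simply keeps the lowest-frequency bins
        # (an assumption; the paper's module may be learned rather than fixed).
        # "Frequency Projector" maps the compressed spectrum to CLIP's embedding size.
        self.projector = nn.Sequential(
            nn.Linear(2 * n_kept_freqs, clip_dim),  # real + imaginary parts concatenated
            nn.GELU(),
            nn.Linear(clip_dim, clip_dim),
        )

    def forward(self, voxels: torch.Tensor) -> torch.Tensor:
        # voxels: (batch, n_voxels) flattened fMRI betas for one subject
        spectrum = torch.fft.rfft(voxels, dim=-1)        # discrete Fourier transform
        spectrum = spectrum[:, : self.n_kept_freqs]      # crude low-pass spectrum compression
        feats = torch.cat([spectrum.real, spectrum.imag], dim=-1)
        return F.normalize(self.projector(feats), dim=-1)  # unit norm for cosine retrieval


def retrieval_top1(fmri_emb: torch.Tensor, image_emb: torch.Tensor) -> float:
    """Top-1 fMRI-to-image retrieval accuracy over an index-paired candidate pool."""
    sims = fmri_emb @ F.normalize(image_emb, dim=-1).T   # cosine similarities
    return (sims.argmax(dim=-1) == torch.arange(len(fmri_emb))).float().mean().item()
```

In this reading, the reported retrieval accuracy is a cosine nearest-neighbor lookup of each fMRI embedding against the CLIP image embeddings of the candidate pool; the frequency-domain projector replaces MindEye's large per-subject MLP backbone.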
Related papers
- MindFormer: A Transformer Architecture for Multi-Subject Brain Decoding via fMRI [50.55024115943266]
We introduce a new Transformer architecture called MindFormer to generate fMRI-conditioned feature vectors.
MindFormer incorporates two key innovations: 1) a novel training strategy based on the IP-Adapter to extract semantically meaningful features from fMRI signals, and 2) a subject-specific token and linear layer that effectively capture individual differences in fMRI signals.
arXiv Detail & Related papers (2024-05-28T00:36:25Z) - NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856]
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals.
By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
arXiv Detail & Related papers (2024-03-27T02:42:52Z) - Transferring Ultrahigh-Field Representations for Intensity-Guided Brain Segmentation of Low-Field Magnetic Resonance Imaging [51.92395928517429]
The use of 7T MRI is limited by its high cost and lower accessibility compared to low-field (LF) MRI.
This study proposes a deep-learning framework that fuses the input LF magnetic resonance feature representations with the inferred 7T-like feature representations for brain image segmentation tasks.
arXiv Detail & Related papers (2024-02-13T12:21:06Z) - fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding [54.17776744076334]
We propose fMRI-PTE, an innovative auto-encoder approach for fMRI pre-training.
Our approach involves transforming fMRI signals into unified 2D representations, ensuring consistency in dimensions and preserving brain activity patterns.
Our contributions encompass introducing fMRI-PTE, innovative data transformation, efficient training, a novel learning strategy, and the universal applicability of our approach.
arXiv Detail & Related papers (2023-11-01T07:24:22Z) - Learning Sequential Information in Task-based fMRI for Synthetic Data Augmentation [10.629487323161323]
We propose an approach for generating synthetic fMRI sequences that can be used to create augmented training datasets in downstream learning.
The synthetic images are evaluated from multiple perspectives including visualizations and an autism spectrum disorder (ASD) classification task.
arXiv Detail & Related papers (2023-08-29T18:36:21Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding [51.911473457195555]
BrainCLIP is a task-agnostic fMRI-based brain decoding model.
It bridges the modality gap between brain activity, image, and text.
BrainCLIP can reconstruct visual stimuli with high semantic fidelity.
arXiv Detail & Related papers (2023-02-25T03:28:54Z) - MouseGAN++: Unsupervised Disentanglement and Contrastive Representation for Multiple MRI Modalities Synthesis and Structural Segmentation of Mouse Brain [4.733517098000804]
Multimodal mouse brain MRI data are often lacking, making automatic segmentation of fine mouse brain structures a very challenging task.
We propose a novel disentangled and contrastive GAN-based framework, named MouseGAN++, to synthesize multiple MR modalities from single ones in a structure-preserving manner.
Using the subsequently learned modality-invariant information as well as the modality-translated images, MouseGAN++ can segment fine brain structures with average Dice coefficients of 90.0% (T2w) and 87.9% (T1w).
arXiv Detail & Related papers (2022-12-04T14:19:49Z) - Interpretability Aware Model Training to Improve Robustness against Out-of-Distribution Magnetic Resonance Images in Alzheimer's Disease Classification [8.050897403457995]
We propose an interpretability aware adversarial training regime to improve robustness against out-of-distribution samples originating from different MRI hardware.
We present preliminary results showing promising performance on out-of-distribution samples.
arXiv Detail & Related papers (2021-11-15T04:42:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.