SLIM-Brain: A Data- and Training-Efficient Foundation Model for fMRI Data Analysis
- URL: http://arxiv.org/abs/2512.21881v1
- Date: Fri, 26 Dec 2025 06:10:31 GMT
- Title: SLIM-Brain: A Data- and Training-Efficient Foundation Model for fMRI Data Analysis
- Authors: Mo Wang, Junfeng Xia, Wenhao Ye, Enyu Liu, Kaining Peng, Jianfeng Feng, Quanying Liu, Hongkai Wen
- Abstract summary: Foundation models are emerging as a powerful paradigm for fMRI analysis. Current approaches face a dual bottleneck of data- and training-efficiency. SLIM-Brain is a new atlas-free foundation model that simultaneously improves both data- and training-efficiency.
- Score: 16.923310176195766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models are emerging as a powerful paradigm for fMRI analysis, but current approaches face a dual bottleneck of data- and training-efficiency. Atlas-based methods aggregate voxel signals into fixed regions of interest, reducing data dimensionality but discarding fine-grained spatial details and requiring extremely large cohorts to train effectively as general-purpose foundation models. Atlas-free methods, on the other hand, operate directly on voxel-level information, preserving spatial fidelity, but are prohibitively memory- and compute-intensive, making large-scale pre-training infeasible. We introduce SLIM-Brain (Sample-efficient, Low-memory fMRI Foundation Model for Human Brain), a new atlas-free foundation model that simultaneously improves both data- and training-efficiency. SLIM-Brain adopts a two-stage adaptive design: (i) a lightweight temporal extractor captures global context across full sequences and ranks data windows by saliency, and (ii) a 4D hierarchical encoder (Hiera-JEPA) learns fine-grained voxel-level representations only from the top-$k$ selected windows, while discarding about 70% of masked patches. Extensive experiments across seven public benchmarks show that SLIM-Brain establishes new state-of-the-art performance on diverse tasks, while requiring only 4,000 pre-training sessions and approximately 30% of the GPU memory of traditional voxel-level methods.
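The two-stage design described above maps naturally onto a short sketch. The following is a minimal, hypothetical PyTorch rendering of the pipeline (saliency scoring, top-$k$ window selection, and deletion of roughly 70% of masked patches); all module names, tensor shapes, and the saliency definition are assumptions rather than the authors' released code.

```python
import torch
import torch.nn as nn

class TemporalExtractor(nn.Module):
    """Stage (i): lightweight model that scores each timestep's saliency."""
    def __init__(self, in_dim, hid_dim=128):
        super().__init__()
        self.gru = nn.GRU(in_dim, hid_dim, batch_first=True)
        self.score = nn.Linear(hid_dim, 1)

    def forward(self, x):                  # x: (batch, time, features)
        h, _ = self.gru(x)                 # global context over the full sequence
        return self.score(h).squeeze(-1)   # saliency per timestep: (batch, time)

def select_topk_windows(x, saliency, window, k):
    """Rank non-overlapping time windows by mean saliency; keep the top-k."""
    b, t, f = x.shape
    n = t // window                                       # number of full windows
    scores = saliency[:, : n * window].view(b, n, window).mean(-1)
    top = scores.topk(k, dim=1).indices                   # (batch, k)
    wins = x[:, : n * window].view(b, n, window, f)
    idx = top[:, :, None, None].expand(-1, -1, window, f)
    return wins.gather(1, idx)                            # (batch, k, window, f)

def drop_masked_patches(patches, mask_ratio=0.7):
    """JEPA-style efficiency trick: delete ~70% of patches before encoding."""
    b, n, d = patches.shape
    keep = max(1, int(n * (1 - mask_ratio)))
    order = torch.rand(b, n, device=patches.device).argsort(dim=1)[:, :keep]
    return patches.gather(1, order[:, :, None].expand(-1, -1, d))
```

Stage (ii), the Hiera-JEPA encoder, would then run only on the output of `drop_masked_patches(select_topk_windows(...))`, which is where the claimed memory savings would come from.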
Related papers
- Data Selection for Fine-tuning Vision Language Models via Cross Modal Alignment Trajectories [45.580892667483674]
We propose the first principled method for data-efficient instruction tuning of Large Vision-Language Models (LVLMs). XMAS clusters examples based on the trajectories of the top singular values of their attention matrices, obtained from fine-tuning a small proxy LVLM. XMAS can discard 50% of the LLaVA-665k dataset and 85% of the Vision-Flan dataset while preserving the performance of LLaVA-1.5-7B on 10 downstream benchmarks.
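Read literally, that pipeline records attention matrices at a few proxy fine-tuning checkpoints, tracks their top singular values, and clusters the resulting trajectories. Below is a hedged NumPy/scikit-learn sketch under those assumptions; the function names and per-cluster quota are invented, not from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def topk_sv_trajectory(attn_per_step, k=4):
    """attn_per_step: attention matrices for one example, one per checkpoint.
    Returns the concatenated top-k singular values across checkpoints."""
    svs = [np.linalg.svd(a, compute_uv=False)[:k] for a in attn_per_step]
    return np.concatenate(svs)

def select_by_trajectory_clustering(trajectories, n_clusters=10, quota=5):
    """Cluster trajectories and keep `quota` examples per cluster."""
    X = np.stack(trajectories)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    kept = [np.where(labels == c)[0][:quota] for c in range(n_clusters)]
    return np.concatenate(kept)
```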
arXiv Detail & Related papers (2025-10-01T20:47:29Z)
- DCA: Graph-Guided Deep Embedding Clustering for Brain Atlases [11.027192223822832]
Deep Cluster Atlas (DCA) is a graph-guided deep embedding clustering framework for generating individualized, voxel-wise brain parcellations. DCA combines a pretrained autoencoder with spatially regularized deep clustering to produce functionally coherent and spatially contiguous regions. Across multiple datasets and scales, DCA outperforms state-of-the-art atlases, improving functional homogeneity by 98.8% and silhouette coefficient by 29%.
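As a rough illustration of "spatially regularized deep clustering", a DEC-style soft-assignment loss plus a voxel-adjacency penalty could look as follows; the loss form, weighting, and graph construction are assumptions, not details from the DCA paper.

```python
import torch

def soft_assignments(z, centroids):
    """DEC-style Student-t soft assignment of voxel embeddings to centroids."""
    d2 = torch.cdist(z, centroids).pow(2)   # (n_voxels, n_clusters)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(dim=1, keepdim=True)

def clustering_loss(q):
    """KL(P || Q) against the sharpened target distribution P."""
    p = q.pow(2) / q.sum(dim=0)
    p = p / p.sum(dim=1, keepdim=True)
    return (p * (p.log() - q.log())).sum(dim=1).mean()

def spatial_regularizer(q, edges):
    """Penalize assignment differences across neighboring voxels.
    edges: (2, n_edges) long tensor over the voxel-adjacency graph."""
    i, j = edges
    return (q[i] - q[j]).pow(2).sum(dim=1).mean()

# total objective (lam is a hypothetical trade-off weight):
# loss = clustering_loss(q) + lam * spatial_regularizer(q, edges)
```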
arXiv Detail & Related papers (2025-09-01T12:33:32Z)
- SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models [51.74498855100541]
Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL). We propose SPaRFT, a self-paced learning framework that enables efficient learning based on the capability of the model being trained.
arXiv Detail & Related papers (2025-08-07T03:50:48Z)
- LifelongPR: Lifelong point cloud place recognition based on sample replay and prompt learning [15.464706470200337]
We propose LifelongPR, a novel continual learning framework for point cloud place recognition (PCPR). LifelongPR effectively extracts and fuses knowledge from sequential point cloud data. Our method achieves a 6.50% improvement in mIR@1, a 7.96% improvement in mR@1, and an 8.95% reduction in F.
arXiv Detail & Related papers (2025-07-14T08:13:33Z)
- Private Training & Data Generation by Clustering Embeddings [74.00687214400021]
Differential privacy (DP) provides a robust framework for protecting individual data. We introduce a novel principled method for DP synthetic image embedding generation. Empirically, a simple two-layer neural network trained on synthetically generated embeddings achieves state-of-the-art (SOTA) classification accuracy.
arXiv Detail & Related papers (2025-06-20T00:17:14Z)
- Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models [50.260693393896716]
Diffusion models (DMs) are powerful generative models capable of producing high-fidelity images but constrained by high computational costs. We propose Flexiffusion, a training-free NAS framework that jointly optimizes generation schedules and model architectures without modifying pre-trained parameters. Our work pioneers a resource-efficient paradigm for searching for high-speed DMs without sacrificing quality.
arXiv Detail & Related papers (2025-06-03T06:02:50Z)
- SHA256 at SemEval-2025 Task 4: Selective Amnesia -- Constrained Unlearning for Large Language Models via Knowledge Isolation [12.838593066237452]
Large language models (LLMs) frequently memorize sensitive information during training, posing risks when deploying publicly accessible models. This paper presents our solution to SemEval-2025 Task 4 on targeted unlearning, which combines causal mediation analysis with layer-specific optimization.
arXiv Detail & Related papers (2025-04-17T15:05:40Z)
- FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models [35.40065954148091]
FINE is a method based on the Learngene framework for initializing downstream networks by leveraging pre-trained models.
It decomposes pre-trained knowledge into the product of matrices (i.e., $U$, $\Sigma$, and $V$), where $U$ and $V$ are shared across network blocks as "learngenes".
It consistently outperforms direct pre-training, particularly for smaller models, achieving state-of-the-art results across variable model sizes.
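As a worked illustration of the factorization above, the sketch below SVD-decomposes a pretrained weight into $U$, $\Sigma$, and $V$ and reuses the leading factors to initialize a downstream layer; the rank choices and sharing scheme are simplifications, not FINE's exact procedure.

```python
import torch

def factorize(weight, rank):
    """SVD-factorize a pretrained weight; U and Vh play the role of shared
    'learngenes', while the singular values act as the size knob."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U[:, :rank], S[:rank], Vh[:rank]

def init_smaller(U, S, Vh, rank):
    """Initialize a variable-sized downstream weight from the shared factors."""
    r = min(rank, S.numel())
    return U[:, :r] @ torch.diag(S[:r]) @ Vh[:r]

pretrained = torch.randn(512, 512)         # stand-in for a pretrained block
U, S, Vh = factorize(pretrained, rank=256)
w_small = init_smaller(U, S, Vh, rank=64)  # same shape, rank-64 initialization
```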
arXiv Detail & Related papers (2024-09-28T08:57:17Z)
- TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation [65.65530016765615]
We propose a hierarchical predictive coding framework that captures multi-scale dependencies through three complementary learning objectives. TokenUnify integrates random token prediction, next-token prediction, and next-all token prediction to create a comprehensive representational space. We also introduce a large-scale EM dataset with 1.2 billion annotated voxels, offering ideal long-sequence visual data with spatial continuity.
arXiv Detail & Related papers (2024-05-27T05:45:51Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are available only on the source dataset, but unavailable on the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
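A toy version of one such pretext task, permuting temporal segments and training a classifier to recover the permutation, might look like this; the segment count and tensor layout are assumptions.

```python
import torch

def permute_segments(skeleton, n_segments=3):
    """Split a skeleton clip (time, joints, 3) into temporal segments,
    shuffle them, and return the permuted clip plus the permutation label."""
    t = skeleton.shape[0] // n_segments * n_segments     # drop the remainder
    segments = skeleton[:t].chunk(n_segments, dim=0)
    perm = torch.randperm(n_segments)
    shuffled = torch.cat([segments[i] for i in perm.tolist()], dim=0)
    return shuffled, perm   # a classifier is trained to recover `perm`
```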
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.