Cross-Modal Fine-Tuning of 3D Convolutional Foundation Models for ADHD Classification with Low-Rank Adaptation
- URL: http://arxiv.org/abs/2511.06163v1
- Date: Sat, 08 Nov 2025 23:29:28 GMT
- Title: Cross-Modal Fine-Tuning of 3D Convolutional Foundation Models for ADHD Classification with Low-Rank Adaptation
- Authors: Jyun-Ping Kao, Shinyeong Rho, Shahar Lazarev, Hyun-Hae Cho, Fangxu Xing, Taehoon Shin, C. -C. Jay Kuo, Jonghye Woo,
- Abstract summary: Early diagnosis of attention-deficit/hyperactivity disorder (ADHD) in children plays a crucial role in improving outcomes in education and mental health.<n>We propose a novel parameter-efficient transfer learning approach that adapts a large-scale 3D convolutional foundation model, pre-trained on CT images, to an MRI-based ADHD classification task.<n>Our method introduces Low-Rank Adaptation (LoRA) in 3D by factorizing 3D convolutional kernels into 2D low-rank updates, dramatically reducing trainable parameters while achieving superior performance.
- Score: 21.53698715497362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Early diagnosis of attention-deficit/hyperactivity disorder (ADHD) in children plays a crucial role in improving outcomes in education and mental health. Diagnosing ADHD using neuroimaging data, however, remains challenging due to heterogeneous presentations and overlapping symptoms with other conditions. To address this, we propose a novel parameter-efficient transfer learning approach that adapts a large-scale 3D convolutional foundation model, pre-trained on CT images, to an MRI-based ADHD classification task. Our method introduces Low-Rank Adaptation (LoRA) in 3D by factorizing 3D convolutional kernels into 2D low-rank updates, dramatically reducing trainable parameters while achieving superior performance. In a five-fold cross-validated evaluation on a public diffusion MRI database, our 3D LoRA fine-tuning strategy achieved state-of-the-art results, with one model variant reaching 71.9% accuracy and another attaining an AUC of 0.716. Both variants use only 1.64 million trainable parameters (over 113x fewer than a fully fine-tuned foundation model). Our results represent one of the first successful cross-modal (CT-to-MRI) adaptations of a foundation model in neuroimaging, establishing a new benchmark for ADHD classification while greatly improving efficiency.
Related papers
- Specializing Foundation Models via Mixture of Low-Rank Experts for Comprehensive Head CT Analysis [6.04562866374803]
We propose a Mixture of Low-Rank Experts (MoLRE) framework that extends LoRA with multiple specialized low-rank adapters and unsupervised soft routing.<n>We present a benchmark of MoLRE across six state-of-the-art medical imaging foundation models spanning 2D and 3D architectures, general-domain, medical-domain, and head CT-specific pretraining, and model sizes ranging from 7M to 431M parameters.
arXiv Detail & Related papers (2026-02-28T14:32:38Z) - 3D-TDA - Topological feature extraction from 3D images for Alzheimer's disease classification [0.8500709198102238]
We propose a novel feature extraction method using persistent homology to analyze structural MRI of the brain.<n>This approach converts topological features into powerful feature vectors through Betti functions.<n>Our model outperforms state-of-the-art deep learning models in both binary and three-class classification tasks.
arXiv Detail & Related papers (2025-11-11T17:48:28Z) - Beyond Data Scarcity Optimizing R3GAN for Medical Image Generation from Small Datasets [0.044780965967547055]
This work investigates how generative adversarial networks (GANs) can be optimized for small datasets to generate realistic and diagnostically meaningful images.<n>Based on systematic experiments with R3GAN, we established effective training strategies and designed an optimized configuration for 256x256-resolution datasets.<n>The generated samples were used to balance an imbalanced embryo dataset, leading to substantial improvement in classification performance.
arXiv Detail & Related papers (2025-10-29T13:03:36Z) - 3DViT-GAT: A Unified Atlas-Based 3D Vision Transformer and Graph Learning Framework for Major Depressive Disorder Detection Using Structural MRI Data [0.0]
Major depressive disorder (MDD) is a prevalent mental health condition that negatively impacts both individual well-being and global public health.<n>This paper develops a unified pipeline that utilizes Vision Transformers (ViTs) for extracting 3D region embeddings from sMRI data and Graph Neural Network (GNN) for classification.
arXiv Detail & Related papers (2025-09-15T17:10:39Z) - EffNetViTLoRA: An Efficient Hybrid Deep Learning Approach for Alzheimer's Disease Diagnosis [2.220152876549942]
Alzheimer's disease (AD) is one of the most prevalent neurodegenerative disorders worldwide.<n>EffNetViTLoRA is an end-to-end model for AD diagnosis using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Magnetic Resonance Imaging (MRI) dataset.
arXiv Detail & Related papers (2025-08-26T18:22:28Z) - Towards a general-purpose foundation model for fMRI analysis [58.06455456423138]
We introduce NeuroSTORM, a framework that learns from 4D fMRI volumes and enables efficient knowledge transfer across diverse applications.<n>NeuroSTORM is pre-trained on 28.65 million fMRI frames (>9,000 hours) from over 50,000 subjects across multiple centers and ages 5 to 100.<n>It outperforms existing methods across five tasks: age/gender prediction, phenotype prediction, disease diagnosis, fMRI-to-image retrieval, and task-based fMRI.
arXiv Detail & Related papers (2025-06-11T23:51:01Z) - 3D Brain MRI Classification for Alzheimer Diagnosis Using CNN with Data Augmentation [0.0]
A three-dimensional convolutional neural network was developed to classify T1-weighted brain scans as healthy or Alzheimer.<n>The network comprises 3D convolution, pooling, batch normalization, dense ReLU layers, and a sigmoid output.
arXiv Detail & Related papers (2025-05-07T03:32:25Z) - Brain Tumor Classification on MRI in Light of Molecular Markers [56.99710477905796]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas.<n>This study aims to utilize a specially MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare.
Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - Video and Synthetic MRI Pre-training of 3D Vision Architectures for
Neuroimage Analysis [3.208731414009847]
Transfer learning involves pre-training deep learning models on a large corpus of data for adaptation to specific tasks.
We benchmarked vision transformers (ViTs) and convolutional neural networks (CNNs) with varied upstream pre-training approaches.
The resulting pre-trained models can be adapted to a range of downstream tasks, even when training data for the target task is limited.
arXiv Detail & Related papers (2023-09-09T00:33:23Z) - Cross-Site Severity Assessment of COVID-19 from CT Images via Domain
Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images offers a great help to the estimation of intensive care unit event.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z) - Deep Implicit Statistical Shape Models for 3D Medical Image Delineation [47.78425002879612]
3D delineation of anatomical structures is a cardinal goal in medical imaging analysis.
Prior to deep learning, statistical shape models that imposed anatomical constraints and produced high quality surfaces were a core technology.
We present deep implicit statistical shape models (DISSMs), a new approach to delineation that marries the representation power of CNNs with the robustness of SSMs.
arXiv Detail & Related papers (2021-04-07T01:15:06Z) - Revisiting 3D Context Modeling with Supervised Pre-training for
Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z) - Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build the domain irrelevant latent space image representation and demonstrate this method to outperform existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.