Related papers: Specializing Foundation Models via Mixture of Low-Rank Experts for Comprehensive Head CT Analysis

Specializing Foundation Models via Mixture of Low-Rank Experts for Comprehensive Head CT Analysis

URL: http://arxiv.org/abs/2603.00675v1
Date: Sat, 28 Feb 2026 14:32:38 GMT
Title: Specializing Foundation Models via Mixture of Low-Rank Experts for Comprehensive Head CT Analysis
Authors: Youngjin Yoo, Han Liu, Bogdan Georgescu, Yanbo Zhang, Sasa Grbic, Michael Baumgartner, Thomas J. Re, Jyotipriya Das, Poikavila Ullaskrishnan, Eva Eibenberger, Andrei Chekkoury, Uttam K. Bodanapally, Savvas Nicolaou, Pina C. Sanelli, Thomas J. Schroeppel, Yvonne W. Lui, Eli Gibson,
Abstract summary: We propose a Mixture of Low-Rank Experts (MoLRE) framework that extends LoRA with multiple specialized low-rank adapters and unsupervised soft routing.<n>We present a benchmark of MoLRE across six state-of-the-art medical imaging foundation models spanning 2D and 3D architectures, general-domain, medical-domain, and head CT-specific pretraining, and model sizes ranging from 7M to 431M parameters.
Score: 6.04562866374803
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Foundation models pre-trained on large-scale datasets demonstrate strong transfer learning capabilities; however, their adaptation to complex multi-label diagnostic tasks-such as comprehensive head CT finding detection-remains understudied. Standard parameter-efficient fine-tuning methods such as LoRA apply uniform adaptations across pathology types, which may limit performance for diverse medical findings. We propose a Mixture of Low-Rank Experts (MoLRE) framework that extends LoRA with multiple specialized low-rank adapters and unsupervised soft routing. This approach enables conditional feature adaptation with less than 0.5% additional parameters and without explicit pathology supervision. We present a comprehensive benchmark of MoLRE across six state-of-the-art medical imaging foundation models spanning 2D and 3D architectures, general-domain, medical-domain, and head CT-specific pretraining, and model sizes ranging from 7M to 431M parameters. Using over 70,000 non-contrast head CT scans with 75 annotated findings-including hemorrhage, infarction, trauma, mass lesions, structural abnormalities, and chronic changes-our experiments demonstrate consistent performance improvements across all models. Gains vary substantially: general-purpose and medical-domain models show the largest improvements (DINOv3-Base: +4.6%; MedGemma: +4.3%), whereas 3D CT-specialized or very large models show more modest gains (+0.2-1.3%). The combination of MoLRE and MedGemma achieves the highest average detection AUC of 0.917. These findings highlight the importance of systematic benchmarking on target clinical tasks, as pretraining domain, architecture, and model scale interact in non-obvious ways.

Related papers

3D-TDA - Topological feature extraction from 3D images for Alzheimer's disease classification [0.8500709198102238]
We propose a novel feature extraction method using persistent homology to analyze structural MRI of the brain.<n>This approach converts topological features into powerful feature vectors through Betti functions.<n>Our model outperforms state-of-the-art deep learning models in both binary and three-class classification tasks.
arXiv Detail & Related papers (2025-11-11T17:48:28Z)
Cross-Modal Fine-Tuning of 3D Convolutional Foundation Models for ADHD Classification with Low-Rank Adaptation [21.53698715497362]
Early diagnosis of attention-deficit/hyperactivity disorder (ADHD) in children plays a crucial role in improving outcomes in education and mental health.<n>We propose a novel parameter-efficient transfer learning approach that adapts a large-scale 3D convolutional foundation model, pre-trained on CT images, to an MRI-based ADHD classification task.<n>Our method introduces Low-Rank Adaptation (LoRA) in 3D by factorizing 3D convolutional kernels into 2D low-rank updates, dramatically reducing trainable parameters while achieving superior performance.
arXiv Detail & Related papers (2025-11-08T23:29:28Z)
A Novel Multi-branch ConvNeXt Architecture for Identifying Subtle Pathological Features in CT Scans [1.2461503242570642]
This paper introduces a novel multi-branch ConvNeXt architecture designed specifically for the nuanced challenges of medical image analysis.<n>The proposed model incorporates a rigorous end-to-end pipeline, from meticulous data preprocessing to augmentation to a disciplined two-phase training strategy.<n> Experimental results demonstrate a superior performance on the validation set, achieving a final ROC-AUC of 0.9937, a validation accuracy of 0.9757, and an F1-score of 0.9825 for COVID-19 cases.
arXiv Detail & Related papers (2025-10-10T08:00:46Z)
Parameter-efficient fine-tuning (PEFT) of Vision Foundation Models for Atypical Mitotic Figure Classification [0.0]
Atypical mitotic figures (AMFs) are rare abnormal cell divisions associated with tumor aggressiveness and poor prognosis.<n>The MID 2025OG challenge introduced a dedicated track for atypical mitosis classification.<n>We investigated the use of large vision foundation models, including Virchow, Virchow2, and UNI, with Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning.
arXiv Detail & Related papers (2025-09-21T05:46:54Z)
Comprehensive Multimodal Deep Learning Survival Prediction Enabled by a Transformer Architecture: A Multicenter Study in Glioblastoma [4.578027879885667]
This research aims to improve glioblastoma survival prediction by integrating MR images, clinical and molecular-pathologic data in a transformer-based deep learning model. The model employs self-supervised learning techniques to effectively encode the high-dimensional MRI input for integration with non-imaging data using cross-attention.
arXiv Detail & Related papers (2024-05-21T17:44:48Z)
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation. We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare. Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
Classification of lung cancer subtypes on CT images with synthetic pathological priors [41.75054301525535]
Cross-scale associations exist in the image patterns between the same case's CT images and its pathological images. We propose self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on CT images.
arXiv Detail & Related papers (2023-08-09T02:04:05Z)
Federated Learning Enables Big Data for Rare Cancer Boundary Detection [98.5549882883963]
We present findings from the largest Federated ML study to-date, involving data from 71 healthcare institutions across 6 continents. We generate an automatic tumor boundary detector for the rare disease of glioblastoma. We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent.
arXiv Detail & Related papers (2022-04-22T17:27:00Z)
Deep Implicit Statistical Shape Models for 3D Medical Image Delineation [47.78425002879612]
3D delineation of anatomical structures is a cardinal goal in medical imaging analysis. Prior to deep learning, statistical shape models that imposed anatomical constraints and produced high quality surfaces were a core technology. We present deep implicit statistical shape models (DISSMs), a new approach to delineation that marries the representation power of CNNs with the robustness of SSMs.
arXiv Detail & Related papers (2021-04-07T01:15:06Z)
Scale-Space Autoencoders for Unsupervised Anomaly Segmentation in Brain MRI [47.26574993639482]
We show improved anomaly segmentation performance and the general capability to obtain much more crisp reconstructions of input data at native resolution. The modeling of the laplacian pyramid further enables the delineation and aggregation of lesions at multiple scales.
arXiv Detail & Related papers (2020-06-23T09:20:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.