Med-LEGO: Editing and Adapting toward Generalist Medical Image Diagnosis
- URL: http://arxiv.org/abs/2503.01164v1
- Date: Mon, 03 Mar 2025 04:27:11 GMT
- Title: Med-LEGO: Editing and Adapting toward Generalist Medical Image Diagnosis
- Authors: Yitao Zhu, Yuan Yin, Jiaming Li, Mengjie Xu, Zihao Zhao, Honglin Xiong, Sheng Wang, Qian Wang
- Abstract summary: Med-LEGO is a training-free framework that enables the seamless integration or updating of a generalist CAD model. Our experiments demonstrate that Med-LEGO outperforms existing methods in both cross-domain and in-domain medical tasks.
- Score: 17.10843389390131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adoption of visual foundation models has become a common practice in computer-aided diagnosis (CAD). While these foundation models provide a viable solution for creating generalist medical AI, privacy concerns make it difficult to pre-train or continuously update such models across multiple domains and datasets, leading many studies to focus on specialist models. To address this challenge, we propose Med-LEGO, a training-free framework that enables the seamless integration or updating of a generalist CAD model by combining multiple specialist models, similar to assembling LEGO bricks. Med-LEGO enhances LoRA (low-rank adaptation) by incorporating singular value decomposition (SVD) to efficiently capture the domain expertise of each specialist model with minimal additional parameters. By combining these adapted weights through simple operations, Med-LEGO allows for the easy integration or modification of specific diagnostic capabilities without the need for original data or retraining. Finally, the combined model can be further adapted to new diagnostic tasks, making it a versatile generalist model. Our extensive experiments demonstrate that Med-LEGO outperforms existing methods in both cross-domain and in-domain medical tasks while using only 0.18% of full model parameters. These merged models show better convergence and generalization to new tasks, providing an effective path toward generalist medical AI.
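The abstract describes enhancing LoRA with SVD so that each specialist's expertise can be captured compactly and then combined "through simple operations." A minimal sketch of this idea is shown below: each specialist contributes a low-rank update B·A, the updates are summed, and the sum is re-compressed with a truncated SVD into a single low-rank adapter. This is an illustrative reconstruction under stated assumptions, not the authors' exact algorithm; the function name and rank choice are hypothetical.

```python
import numpy as np

def merge_lora_adapters(adapters, rank):
    """Merge several LoRA weight updates into one low-rank adapter.

    Each adapter is a (B, A) pair whose weight update is B @ A.
    We sum the full updates, then re-compress the sum with a
    truncated SVD, keeping the top `rank` singular directions.
    (Illustrative sketch only -- not Med-LEGO's exact procedure.)
    """
    # Accumulate the full weight update contributed by every specialist.
    delta = sum(B @ A for B, A in adapters)
    # Truncated SVD keeps the dominant singular directions.
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    B_merged = U[:, :rank] * s[:rank]  # absorb singular values into B
    A_merged = Vt[:rank, :]
    return B_merged, A_merged

# Toy usage: merge two rank-4 specialists on a 64x32 weight matrix.
rng = np.random.default_rng(0)
adapters = [(rng.standard_normal((64, 4)), rng.standard_normal((4, 32)))
            for _ in range(2)]
B, A = merge_lora_adapters(adapters, rank=8)
print(B.shape, A.shape)  # (64, 8) (8, 32)
```

Because the sum of two rank-4 updates has rank at most 8, the rank-8 merge here is lossless; in general the truncation is what keeps the merged adapter's parameter count small (the paper reports only 0.18% of full model parameters).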
Related papers
- Towards All-in-One Medical Image Re-Identification [34.74569001275221]
Medical image re-identification (MedReID) is under-explored so far, despite its critical applications in personalized healthcare and privacy protection.
We introduce a thorough benchmark and a unified model for this problem.
We deploy the proposed MedReID technique to two real-world applications, history-augmented personalized diagnosis and medical privacy protection.
arXiv Detail & Related papers (2025-03-11T08:35:00Z)
- KA$^2$ER: Knowledge Adaptive Amalgamation of ExpeRts for Medical Images Segmentation [5.807887214293438]
We propose an adaptive amalgamation knowledge framework that aims to train a versatile foundation model to handle the joint goals of multiple expert models.
In particular, we first train an nnUNet-based expert model for each task, and reuse the pre-trained SwinUNETR as the target foundation model.
Within the hidden layer, the hierarchical attention mechanisms are designed to achieve adaptive merging of the target model to the hidden layer feature knowledge of all experts.
arXiv Detail & Related papers (2024-10-28T14:49:17Z)
- Repurposing Foundation Model for Generalizable Medical Time Series Classification [16.21546283978257]
FORMED is a foundation classification model that leverages a pre-trained backbone.
It can adapt seamlessly to unseen MedTS datasets, regardless of the number of channels, sample lengths, or medical tasks.
Our results highlight FORMED as a versatile and scalable model for a wide range of MedTS classification tasks, positioning it as a strong foundation model for future research in MedTS analysis.
arXiv Detail & Related papers (2024-10-03T23:50:04Z)
- LoRKD: Low-Rank Knowledge Decomposition for Medical Foundation Models [59.961172635689664]
"Knowledge Decomposition" aims to improve performance on specific medical tasks.
We propose a novel framework named Low-Rank Knowledge Decomposition (LoRKD).
LoRKD explicitly separates gradients from different tasks by incorporating low-rank expert modules and efficient knowledge separation convolution.
arXiv Detail & Related papers (2024-09-29T03:56:21Z)
- FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models [54.09244105445476]
This study introduces a novel knowledge injection approach, FedKIM, to scale the medical foundation model within a federated learning framework.
FedKIM leverages lightweight local models to extract healthcare knowledge from private data and integrates this knowledge into a centralized foundation model.
Our experiments across twelve tasks in seven modalities demonstrate the effectiveness of FedKIM in various settings.
arXiv Detail & Related papers (2024-08-17T15:42:29Z)
- Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models [17.643421997037514]
We propose a novel framework that tackles both discriminative and generative multimodal medical tasks.
The learning of Med-MoE consists of three steps: multimodal medical alignment, instruction tuning and routing, and domain-specific MoE tuning.
Our model can achieve performance superior to or on par with state-of-the-art baselines.
arXiv Detail & Related papers (2024-04-16T02:35:17Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine [55.29668193415034]
We present OpenMEDLab, an open-source platform for multi-modality foundation models.
It encapsulates solutions of pioneering attempts in prompting and fine-tuning large language and vision models for frontline clinical and bioinformatic applications.
It opens access to a group of pre-trained foundation models for various medical image modalities, clinical text, protein engineering, etc.
arXiv Detail & Related papers (2024-02-28T03:51:02Z)
- Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks.
We propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z)
- Domain Generalization on Medical Imaging Classification using Episodic Training with Task Augmentation [62.49837463676111]
We propose a novel scheme of episodic training with task augmentation on medical imaging classification.
Motivated by the limited number of source domains in real-world medical deployment, we consider task-level overfitting, a challenge unique to this setting.
arXiv Detail & Related papers (2021-06-13T03:56:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.