Related papers: ReclAIm: A multi-agent framework for degradation-aware performance tuning of medical imaging AI

URL: http://arxiv.org/abs/2510.17004v1
Date: Sun, 19 Oct 2025 21:02:01 GMT
Title: ReclAIm: A multi-agent framework for degradation-aware performance tuning of medical imaging AI
Authors: Eleftherios Tzanis, Michail E. Klontzas
Abstract summary: ReclAIm is a multi-agent framework capable of autonomously monitoring, evaluating, and fine-tuning medical image classification models. It successfully trains, evaluates, and maintains consistent performance of models across MRI, CT, and X-ray datasets.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Ensuring the long-term reliability of AI models in clinical practice requires continuous performance monitoring and corrective actions when degradation occurs. Addressing this need, this manuscript presents ReclAIm, a multi-agent framework capable of autonomously monitoring, evaluating, and fine-tuning medical image classification models. The system, built on a large language model core, operates entirely through natural language interaction, eliminating the need for programming expertise. ReclAIm successfully trains, evaluates, and maintains consistent performance of models across MRI, CT, and X-ray datasets. Once ReclAIm detects significant performance degradation, it autonomously executes state-of-the-art fine-tuning procedures that substantially reduce the performance gap. In cases with performance drops of up to -41.1% (MRI InceptionV3), ReclAIm managed to readjust performance metrics within 1.5% of the initial model results. ReclAIm enables automated, continuous maintenance of medical imaging AI models in a user-friendly and adaptable manner that facilitates broader adoption in both research and clinical environments.
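The monitor-then-correct loop described in the abstract can be illustrated with a minimal sketch. This is not the ReclAIm implementation: the function names, the 5% threshold, and the choice to measure degradation as a relative (rather than absolute) change are all assumptions made for illustration, and the abstract does not specify how its percentages are computed.

```python
def relative_drop(baseline: float, current: float) -> float:
    """Relative change of a metric versus its baseline, in percent.

    Negative values mean the metric got worse. (Hypothetical helper.)
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline * 100.0


def needs_fine_tuning(baseline: float, current: float,
                      threshold_pct: float = 5.0) -> bool:
    """Flag a model whose metric fell by more than threshold_pct percent.

    The 5% default is an assumed value, not one taken from the paper.
    """
    return relative_drop(baseline, current) < -threshold_pct


if __name__ == "__main__":
    # Hypothetical numbers: accuracy at deployment vs. on newly collected data.
    baseline_acc, current_acc = 0.80, 0.60
    change = relative_drop(baseline_acc, current_acc)
    print(f"relative change: {change:.1f}%")
    print("fine-tune:", needs_fine_tuning(baseline_acc, current_acc))
```

In a setup like this, the threshold decides how often corrective fine-tuning is triggered, so in practice it would be chosen per metric and per clinical task.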

Related papers

MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning [53.37068897861388]
MedSAM-Agent is a framework that reformulates interactive segmentation as a multi-step autonomous decision-making process. We develop a two-stage training pipeline that integrates multi-turn, end-to-end outcome verification. Experiments across 6 medical modalities and 21 datasets demonstrate that MedSAM-Agent achieves state-of-the-art performance.
arXiv Detail & Related papers (2026-02-03T09:47:49Z)
Diagnostic Performance of Universal-Learning Ultrasound AI Across Multiple Organs and Tasks: the UUSIC25 Challenge [34.86849736082012]
Current ultrasound AI remains fragmented into single-task tools. General-purpose AI models achieve high accuracy and efficiency across multiple tasks using a single architecture.
arXiv Detail & Related papers (2025-12-19T06:54:30Z)
An autonomous agent for auditing and improving the reliability of clinical AI models [11.225863068085266]
We introduce ModelAuditor, a self-reflective agent that converses with users. ModelAuditor simulates context-dependent, clinically relevant distribution shifts. It then generates interpretable reports explaining how much performance likely degrades during deployment.
arXiv Detail & Related papers (2025-07-08T07:58:52Z)
Scalable Drift Monitoring in Medical Imaging AI [37.1899538374058]
We develop MMC+, an enhanced framework for scalable drift monitoring. It builds upon the CheXstray framework that introduced real-time drift detection for medical imaging AI models. MMC+ offers a reliable and cost-effective alternative to continuous performance monitoring.
arXiv Detail & Related papers (2024-10-17T02:57:35Z)
PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation. Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process. Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection. Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels. Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We explore training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
New Epochs in AI Supervision: Design and Implementation of an Autonomous Radiology AI Monitoring System [5.50085484902146]
We introduce novel methods for monitoring the performance of radiology AI classification models in practice. We propose two metrics - predictive divergence and temporal stability - to be used for preemptive alerts of AI performance changes.
arXiv Detail & Related papers (2023-11-24T06:29:04Z)
Robust and Efficient Medical Imaging with Self-Supervision [80.62711706785834]
We present REMEDIS, a unified representation learning strategy to improve robustness and data-efficiency of medical imaging AI. We study a diverse range of medical imaging tasks and simulate three realistic application scenarios using retrospective data.
arXiv Detail & Related papers (2022-05-19T17:34:18Z)
CheXstray: Real-time Multi-Modal Data Concordance for Drift Detection in Medical Imaging AI [1.359138408203412]
We build and test a medical imaging AI drift monitoring workflow that tracks data and model drift without contemporaneous ground truth. Key contributions include (1) proof-of-concept for medical imaging drift detection, including the use of a VAE and domain-specific statistical methods. This work has important implications for addressing the translation gap related to continuous medical imaging AI model monitoring in dynamic healthcare environments.
arXiv Detail & Related papers (2022-02-06T18:58:35Z)
Performance or Trust? Why Not Both. Deep AUC Maximization with Self-Supervised Learning for COVID-19 Chest X-ray Classifications [72.52228843498193]
In training deep learning models, a compromise often must be made between performance and trust. In this work, we integrate a new surrogate loss with self-supervised learning for computer-aided screening of COVID-19 patients.
arXiv Detail & Related papers (2021-12-14T21:16:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.