Learning to reason about rare diseases through retrieval-augmented agents
- URL: http://arxiv.org/abs/2511.04720v1
- Date: Thu, 06 Nov 2025 10:27:52 GMT
- Title: Learning to reason about rare diseases through retrieval-augmented agents
- Authors: Ha Young Kim, Jun Li, Ana Beatriz Solana, Carolin M. Pirkl, Benedikt Wiestler, Julia A. Schnabel, Cosmin I. Bercea,
- Abstract summary: RADAR is an agentic system for rare disease detection in brain MRI.<n>Our approach uses AI agents with access to external medical knowledge by embedding both case reports and literature.<n>The agent retrieves clinically relevant evidence to guide diagnostic decision making on unseen diseases, without the need of additional training.
- Score: 10.380898352925213
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Rare diseases represent the long tail of medical imaging, where AI models often fail due to the scarcity of representative training data. In clinical workflows, radiologists frequently consult case reports and literature when confronted with unfamiliar findings. Following this line of reasoning, we introduce RADAR, Retrieval Augmented Diagnostic Reasoning Agents, an agentic system for rare disease detection in brain MRI. Our approach uses AI agents with access to external medical knowledge by embedding both case reports and literature using sentence transformers and indexing them with FAISS to enable efficient similarity search. The agent retrieves clinically relevant evidence to guide diagnostic decision making on unseen diseases, without the need of additional training. Designed as a model-agnostic reasoning module, RADAR can be seamlessly integrated with diverse large language models, consistently improving their rare pathology recognition and interpretability. On the NOVA dataset comprising 280 distinct rare diseases, RADAR achieves up to a 10.2% performance gain, with the strongest improvements observed for open source models such as DeepSeek. Beyond accuracy, the retrieved examples provide interpretable, literature grounded explanations, highlighting retrieval-augmented reasoning as a powerful paradigm for low-prevalence conditions in medical imaging.
Related papers
- An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas.<n>We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head.<n>Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z) - AI-Driven Radiology Report Generation for Traumatic Brain Injuries [0.0]
We propose an AI-based approach for automatic radiology report generation tailored to cranial trauma cases.<n>Our model integrates an AC-BiFPN with a Transformer architecture to capture and process complex medical imaging data.<n>We evaluate the performance of our model on the RSNA Intracranial Hemorrhage Detection dataset.
arXiv Detail & Related papers (2025-10-09T17:39:04Z) - RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis [56.373297358647655]
Retrieval-Augmented Diagnosis (RAD) is a novel framework that injects external knowledge into multimodal models directly on downstream tasks.<n>RAD operates through three key mechanisms: retrieval and refinement of disease-centered knowledge from multiple medical sources, a guideline-enhanced contrastive loss transformer, and a dual decoder.
arXiv Detail & Related papers (2025-09-24T10:36:14Z) - End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning [52.12425911708585]
Deep-DxSearch is an agentic RAG system trained end-to-end with reinforcement learning (RL)<n>In Deep-DxSearch, we first construct a large-scale medical retrieval corpus comprising patient records and reliable medical knowledge sources.<n> Experiments demonstrate that our end-to-end RL training framework consistently outperforms prompt-engineering and training-free RAG approaches.
arXiv Detail & Related papers (2025-08-21T17:42:47Z) - Clinically-guided Data Synthesis for Laryngeal Lesion Detection [2.573786844054239]
This study introduces a novel approach that exploits a Latent Diffusion Model (LDM) coupled with a ControlNet adapter to generate laryngeal endoscopic image-annotation pairs.<n>The proposed approach can be leveraged to expand training datasets for CADx/e models, empowering the assessment process in laryngology.
arXiv Detail & Related papers (2025-08-08T09:55:54Z) - Agentic large language models improve retrieval-based radiology question answering [4.208637377704778]
radiology Retrieval and Reasoning (RaR) is a multi-step retrieval and reasoning framework for radiology question answering.<n>RaR significantly improved mean diagnostic accuracy over zero-shot prompting and conventional online RAG.<n>RaR retrieval reduced hallucinations (mean 9.4%) and retrieved clinically relevant context in 46% of cases.
arXiv Detail & Related papers (2025-08-01T16:18:52Z) - Explainable AI for Autism Diagnosis: Identifying Critical Brain Regions Using fMRI Data [0.29687381456163997]
Early diagnosis and intervention for Autism Spectrum Disorder (ASD) has been shown to significantly improve the quality of life of autistic individuals.<n>There is a need for objective biomarkers of ASD which can help improve diagnostic accuracy.<n>Deep learning (DL) has achieved outstanding performance in diagnosing diseases and conditions from medical imaging data.<n>This research aims to improve the accuracy and interpretability of ASD diagnosis by creating a DL model that can not only accurately classify ASD but also provide explainable insights into its working.
arXiv Detail & Related papers (2024-09-19T23:08:09Z) - MAGDA: Multi-agent guideline-driven diagnostic assistance [43.15066219293877]
In emergency departments, rural hospitals, or clinics in less developed regions, clinicians often lack fast image analysis by trained radiologists.
In this work, we introduce a new approach for zero-shot guideline-driven decision support.
We model a system of multiple LLM agents augmented with a contrastive vision-language model that collaborate to reach a patient diagnosis.
arXiv Detail & Related papers (2024-09-10T09:10:30Z) - Deep Reinforcement Learning Framework for Thoracic Diseases
Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIHX-ray 14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z) - Exploring and Distilling Posterior and Prior Knowledge for Radiology
Report Generation [55.00308939833555]
The PPKED includes three modules: Posterior Knowledge Explorer (PoKE), Prior Knowledge Explorer (PrKE) and Multi-domain Knowledge Distiller (MKD)
PoKE explores the posterior knowledge, which provides explicit abnormal visual regions to alleviate visual data bias.
PrKE explores the prior knowledge from the prior medical knowledge graph (medical knowledge) and prior radiology reports (working experience) to alleviate textual data bias.
arXiv Detail & Related papers (2021-06-13T11:10:02Z) - Variational Knowledge Distillation for Disease Classification in Chest
X-Rays [102.04931207504173]
We propose itvariational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z) - Query-Focused EHR Summarization to Aid Imaging Diagnosis [22.21438906817433]
We propose and evaluate models that extract relevant text snippets from patient records to provide a rough case summary.
We use groups of International Classification of Diseases (ICD) codes observed in 'future' records as noisy proxies for 'downstream' diagnoses.
We train (via distant supervision) and evaluate variants of this model on EHR data from Brigham and Women's Hospital in Boston and MIMIC-III.
arXiv Detail & Related papers (2020-04-09T16:32:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.