Towards Conversational AI for Disease Management
- URL: http://arxiv.org/abs/2503.06074v1
- Date: Sat, 08 Mar 2025 05:48:58 GMT
- Title: Towards Conversational AI for Disease Management
- Authors: Anil Palepu, Valentin Liévin, Wei-Hung Weng, Khaled Saab, David Stutz, Yong Cheng, Kavita Kulkarni, S. Sara Mahdavi, Joëlle Barral, Dale R. Webster, Katherine Chou, Avinatan Hassidim, Yossi Matias, James Manyika, Ryutaro Tanno, Vivek Natarajan, Adam Rodman, Tao Tu, Alan Karthikesalingam, Mike Schaekermann,
- Abstract summary: Articulate Medical Intelligence Explorer (AMIE) is an agentic system optimised for clinical management and dialogue.<n>AMIE is non-inferior to PCPs in management reasoning as assessed by specialist physicians.<n>AMIE's strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management.
- Score: 29.189384095061722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While large language models (LLMs) have shown promise in diagnostic dialogue, their capabilities for effective management reasoning - including disease progression, therapeutic response, and safe medication prescription - remain under-explored. We advance the previously demonstrated diagnostic capabilities of the Articulate Medical Intelligence Explorer (AMIE) through a new LLM-based agentic system optimised for clinical management and dialogue, incorporating reasoning over the evolution of disease and multiple patient visit encounters, response to therapy, and professional competence in medication prescription. To ground its reasoning in authoritative clinical knowledge, AMIE leverages Gemini's long-context capabilities, combining in-context retrieval with structured reasoning to align its output with relevant and up-to-date clinical practice guidelines and drug formularies. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) study, AMIE was compared to 21 primary care physicians (PCPs) across 100 multi-visit case scenarios designed to reflect UK NICE Guidance and BMJ Best Practice guidelines. AMIE was non-inferior to PCPs in management reasoning as assessed by specialist physicians and scored better in both preciseness of treatments and investigations, and in its alignment with and grounding of management plans in clinical guidelines. To benchmark medication reasoning, we developed RxQA, a multiple-choice question benchmark derived from two national drug formularies (US, UK) and validated by board-certified pharmacists. While AMIE and PCPs both benefited from the ability to access external drug information, AMIE outperformed PCPs on higher difficulty questions. While further research would be needed before real-world translation, AMIE's strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management.
Related papers
- Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases [48.87360916431396]
We introduce MedR-Bench, a benchmarking dataset of 1,453 structured patient cases, annotated with reasoning references.<n>We propose a framework encompassing three critical examination recommendation, diagnostic decision-making, and treatment planning, simulating the entire patient care journey.<n>Using this benchmark, we evaluate five state-of-the-art reasoning LLMs, including DeepSeek-R1, OpenAI-o3-mini, and Gemini-2.0-Flash Thinking, etc.
arXiv Detail & Related papers (2025-03-06T18:35:39Z) - Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks [46.95099594570405]
Medication recommendation systems have garnered attention within healthcare for their potential to deliver personalized and efficacious drug combinations based on patient's clinical data.
Existing methodologies encounter challenges in adapting to diverse Electronic Health Records (EHR) systems.
We propose Language-Assisted Medication Recommendation (LAMO), which employs a parameter-efficient fine-tuning approach to tailor open-source LLMs for optimal performance in medication recommendation scenarios.
arXiv Detail & Related papers (2025-03-05T17:28:16Z) - Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support [22.40301339126307]
We introduce Citrus, a medical language model that bridges the gap between clinical expertise and AI reasoning.<n>The model is trained on a large corpus of simulated expert disease reasoning data.<n>We release the last-stage training data, including a custom-built medical diagnostic dialogue dataset.
arXiv Detail & Related papers (2025-02-25T15:05:12Z) - Natural Language-Assisted Multi-modal Medication Recommendation [97.07805345563348]
We introduce the Natural Language-Assisted Multi-modal Medication Recommendation(NLA-MMR)<n>The NLA-MMR is a multi-modal alignment framework designed to learn knowledge from the patient view and medication view jointly.<n>In this vein, we employ pretrained language models(PLMs) to extract in-domain knowledge regarding patients and medications.
arXiv Detail & Related papers (2025-01-13T09:51:50Z) - Towards Next-Generation Medical Agent: How o1 is Reshaping Decision-Making in Medical Scenarios [46.729092855387165]
We study the choice of the backbone LLM for medical AI agents, which is the foundation for the agent's overall reasoning and action generation.<n>Our findings demonstrate o1's ability to enhance diagnostic accuracy and consistency, paving the way for smarter, more responsive AI tools.
arXiv Detail & Related papers (2024-11-16T18:19:53Z) - Exploring Large Language Models for Specialist-level Oncology Care [17.34069859182619]
We probe the performance of AMIE, a research conversational diagnostic AI system, in the subspecialist domain of breast oncology care.
We curated a set of 50 synthetic breast cancer vignettes representing a range of treatment-naive and treatment-refractory cases.
We developed a detailed clinical rubric for evaluating management plans, including axes such as the quality of case summarization, safety of the proposed care plan, and recommendations for chemotherapy, radiotherapy, surgery and hormonal therapy.
arXiv Detail & Related papers (2024-11-05T18:30:13Z) - RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment [54.91736546490813]
We introduce the RuleAlign framework, designed to align Large Language Models with specific diagnostic rules.
We develop a medical dialogue dataset comprising rule-based communications between patients and physicians.
Experimental results demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-08-22T17:44:40Z) - MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway
Encoding [48.348511646407026]
We introduce the Medical dialogue with Knowledge enhancement and clinical Pathway encoding framework.
The framework integrates an external knowledge enhancement module through a medical knowledge graph and an internal clinical pathway encoding via medical entities and physician actions.
arXiv Detail & Related papers (2024-03-11T10:57:45Z) - AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce textbfAI Hospital, a framework simulating dynamic medical interactions between emphDoctor as player and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z) - Towards Conversational Diagnostic AI [32.84876349808714]
We introduce AMIE (Articulate Medical Intelligence Explorer), a Large Language Model (LLM) based AI system optimized for diagnostic dialogue.
AMIE uses a self-play based simulated environment with automated feedback mechanisms for scaling learning across diverse disease conditions.
AMIE demonstrated greater diagnostic accuracy and superior performance on 28 of 32 axes according to specialist physicians and 24 of 26 axes according to patient actors.
arXiv Detail & Related papers (2024-01-11T04:25:06Z) - ABiMed: An intelligent and visual clinical decision support system for
medication reviews and polypharmacy management [3.843569766201585]
The aim of ABiMed is to design an innovative clinical decision support system for medication reviews and polypharmacy management.
ABiMed associates several approaches: guidelines implementation, but the automatic extraction of patient data from the GP's electronic health record and its transfer to the pharmacist, and the visual presentation of contextualized drug knowledge using visual analytics.
arXiv Detail & Related papers (2023-12-13T11:06:45Z) - Informing clinical assessment by contextualizing post-hoc explanations
of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients clinical state.
We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability.
Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.