Multi-agent Self-triage System with Medical Flowcharts
- URL: http://arxiv.org/abs/2511.12439v1
- Date: Sun, 16 Nov 2025 03:48:22 GMT
- Title: Multi-agent Self-triage System with Medical Flowcharts
- Authors: Yujia Liu, Sophia Yu, Hongyue Jin, Jessica Wen, Alexander Qian, Terrence Lee, Mattheus Ramsis, Gi Won Choi, Lianhui Qin, Xin Liu, Edward J. Wang
- Abstract summary: We introduce a proof-of-concept conversational self-triage system that guides LLMs with 100 clinically validated flowcharts from the American Medical Association. Performance was evaluated at scale using synthetic datasets of simulated conversations.
- Score: 36.31241490919295
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online health resources and large language models (LLMs) are increasingly used as a first point of contact for medical decision-making, yet their reliability in healthcare remains limited by low accuracy, lack of transparency, and susceptibility to unverified information. We introduce a proof-of-concept conversational self-triage system that guides LLMs with 100 clinically validated flowcharts from the American Medical Association, providing a structured and auditable framework for patient decision support. The system leverages a multi-agent framework consisting of a retrieval agent, a decision agent, and a chat agent to identify the most relevant flowchart, interpret patient responses, and deliver personalized, patient-friendly recommendations, respectively. Performance was evaluated at scale using synthetic datasets of simulated conversations. The system achieved 95.29% top-3 accuracy in flowchart retrieval (N=2,000) and 99.10% accuracy in flowchart navigation across varied conversational styles and conditions (N=37,200). By combining the flexibility of free-text interaction with the rigor of standardized clinical protocols, this approach demonstrates the feasibility of transparent, accurate, and generalizable AI-assisted self-triage, with potential to support informed patient decision-making while improving healthcare resource utilization.
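The three-agent flow described in the abstract (retrieve a flowchart, interpret each patient reply, phrase the recommendation) can be illustrated with a minimal sketch. This is not the authors' implementation: the LLM calls are replaced with simple keyword matching, and the flowcharts, node contents, and the names `Node`, `FLOWCHARTS`, `retrieve`, `decide`, and `triage` are all hypothetical placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    text: str                  # a yes/no question, or the final recommendation
    yes: Optional[str] = None  # id of the next node after a "yes" reply
    no: Optional[str] = None   # id of the next node after a "no" reply


# Hypothetical toy flowcharts for illustration; not taken from the AMA material.
FLOWCHARTS = {
    "headache": {
        "keywords": {"headache", "head", "migraine"},
        "nodes": {
            "start": Node("Is the headache sudden and severe?", yes="er", no="rest"),
            "er": Node("Seek emergency care now."),
            "rest": Node("Rest, hydrate, and monitor symptoms."),
        },
    },
    "cough": {
        "keywords": {"cough", "coughing", "phlegm"},
        "nodes": {
            "start": Node("Has the cough lasted more than three weeks?", yes="doctor", no="home"),
            "doctor": Node("Schedule a visit with your doctor."),
            "home": Node("Try home care and watch for fever."),
        },
    },
}


def retrieve(complaint: str, k: int = 3) -> list[str]:
    """Retrieval agent stand-in: rank flowcharts by keyword overlap, return the top k."""
    words = set(complaint.lower().split())
    ranked = sorted(
        FLOWCHARTS,
        key=lambda name: len(words & FLOWCHARTS[name]["keywords"]),
        reverse=True,
    )
    return ranked[:k]


def decide(reply: str) -> bool:
    """Decision agent stand-in: map a free-text reply to a yes/no branch."""
    return reply.strip().lower().startswith(("yes", "yeah", "yep"))


def triage(complaint: str, replies: list[str]) -> str:
    """Walk the best-matching flowchart, then phrase the result (chat agent stand-in)."""
    nodes = FLOWCHARTS[retrieve(complaint)[0]]["nodes"]
    node = nodes["start"]
    pending = iter(replies)
    while node.yes is not None:  # leaf nodes have no outgoing branches
        node = nodes[node.yes if decide(next(pending)) else node.no]
    return f"Recommendation: {node.text}"


print(triage("I have a bad headache", ["yes, very sudden"]))
```

Because every step follows an explicit node in a fixed chart, the path taken can be logged and audited, which is the property the abstract attributes to flowchart-guided triage. A real system would swap the keyword heuristics for model calls while keeping the chart as the source of truth.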
Related papers
- A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine [59.78991974851707]
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis. Most medical LLMs are trained on data from a single institution, which faces limitations in generalizability and safety in heterogeneous systems. We introduce the model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications.
arXiv Detail & Related papers (2026-01-29T18:48:21Z)
- Towards Reliable Medical LLMs: Benchmarking and Enhancing Confidence Estimation of Large Language Models in Medical Consultation [97.36081721024728]
We propose the first benchmark for assessing confidence in multi-turn interaction during realistic medical consultations. Our benchmark unifies three types of medical data for open-ended diagnostic generation. We present MedConf, an evidence-grounded linguistic self-assessment framework.
arXiv Detail & Related papers (2026-01-22T04:51:39Z)
- DispatchMAS: Fusing taxonomy and artificial intelligence agents for emergency medical services [49.70819009392778]
Large Language Models (LLMs) and Multi-Agent Systems (MAS) offer opportunities to augment dispatchers. This study aimed to develop and evaluate a taxonomy-grounded, multi-agent system for simulating realistic scenarios.
arXiv Detail & Related papers (2025-10-24T08:01:21Z)
- Timely Clinical Diagnosis through Active Test Selection [49.091903570068155]
We propose ACTMED (Adaptive Clinical Test selection via Model-based Experimental Design) to better emulate real-world diagnostic reasoning. LLMs act as flexible simulators, generating plausible patient state distributions and supporting belief updates without requiring structured, task-specific training data. We evaluate ACTMED on real-world datasets and show it can optimize test selection to improve diagnostic accuracy, interpretability, and resource use.
arXiv Detail & Related papers (2025-10-21T18:10:45Z)
- MedKGEval: A Knowledge Graph-Based Multi-Turn Evaluation Framework for Open-Ended Patient Interactions with Clinical LLMs [19.12790150016383]
We present MedKGEval, a novel multi-turn evaluation framework for clinical large language models. A knowledge graph-driven patient simulation mechanism retrieves relevant medical facts from a curated knowledge graph. A turn-level evaluation framework assesses each model response for clinical appropriateness, factual correctness, and safety.
arXiv Detail & Related papers (2025-10-14T07:22:26Z)
- From Staff Messages to Actionable Insights: A Multi-Stage LLM Classification Framework for Healthcare Analytics [0.0]
This paper presents a framework that identifies staff message topics and classifies messages by their reasons in a multi-class fashion. The best-performing model was o3, achieving 78.4% weighted F1-score and 79.2% accuracy. The proposed methodology incorporates data security measures and HIPAA compliance requirements essential for healthcare environments.
arXiv Detail & Related papers (2025-09-05T20:15:52Z)
- TrialMatchAI: An End-to-End AI-powered Clinical Trial Recommendation System to Streamline Patient-to-Trial Matching [0.0]
We present TrialMatchAI, an AI-powered recommendation system that automates patient-to-trial matching. Built on fine-tuned, open-source large language models, TrialMatchAI ensures transparency and maintains a lightweight deployment footprint. In real-world validation, 92 percent of oncology patients had at least one relevant trial retrieved within the top 20 recommendations.
arXiv Detail & Related papers (2025-05-13T12:39:06Z)
- Enhancing Clinical Decision-Making: Integrating Multi-Agent Systems with Ethical AI Governance [1.0195618602298682]
We compare novel agent system designs that use modular agents to analyze laboratory results, vital signs, and clinical context. We implement our agent system with the eICU database, including running lab analysis, vitals-only interpreters, and contextual reasoners agents.
arXiv Detail & Related papers (2025-03-25T05:32:43Z)
- Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering [51.26412822853409]
We present a novel personalized federated learning (pFL) method for medical visual question answering (VQA) models.
Our method introduces learnable prompts into a Transformer architecture to efficiently train it on diverse medical datasets without massive computational costs.
arXiv Detail & Related papers (2024-10-23T00:31:17Z)
- Simulated patient systems are intelligent when powered by large language model-based AI agents [32.73072809937573]
We developed AIPatient, an intelligent simulated patient system powered by large language model-based AI agents. The system incorporates the Retrieval Augmented Generation framework, powered by six task-specific LLM-based AI agents for complex reasoning. For simulation reality, the system is also powered by the AIPatient KG (Knowledge Graph), built with de-identified real patient data.
arXiv Detail & Related papers (2024-09-27T17:17:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.