Related papers: Adaptive Reasoning and Acting in Medical Language Agents

Related papers

Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning.<n>This paper provides the first systematic review of this emerging field.<n>We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z)
DynamiCare: A Dynamic Multi-Agent Framework for Interactive and Open-Ended Medical Decision-Making [4.801722645791233]
DynamiCare is a novel dynamic multi-agent framework that models clinical diagnosis as a multi-round, interactive loop.<n>We demonstrate the feasibility and effectiveness of DynamiCare through extensive experiments.
arXiv Detail & Related papers (2025-07-03T13:43:10Z)
Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making [80.94208848596215]
We present a new concept called Catfish Agent, a role-specialized LLM designed to inject structured dissent and counter silent agreement.<n>Inspired by the catfish effect'' in organizational psychology, the Catfish Agent is designed to challenge emerging consensus to stimulate deeper reasoning.
arXiv Detail & Related papers (2025-05-27T17:59:50Z)
Learning to Be A Doctor: Searching for Effective Medical Agent Architectures [32.82784216021035]
This paper introduces a novel framework for the automated design of medical agent architectures. Motivated by the success of automated machine learning (AutoML), we define a hierarchical and expressive agent search space. Our framework conceptualizes medical agents as graph-based architectures composed of diverse, functional node types.
arXiv Detail & Related papers (2025-04-15T15:44:21Z)
Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions [16.50490537786593]
We introduce MedAgentSim, an open-source simulated clinical environment with doctor, patient, and measurement agents. Unlike prior approaches, our framework requires doctor agents to actively engage with patients through multi-turn conversations. We incorporate self improvement mechanisms that allow models to iteratively refine their diagnostic strategies.
arXiv Detail & Related papers (2025-03-28T17:59:53Z)
Structured Outputs Enable General-Purpose LLMs to be Medical Experts [50.02627258858336]
Large language models (LLMs) often struggle with open-ended medical questions. We propose a novel approach utilizing structured medical reasoning. Our approach achieves the highest Factuality Score of 85.8, surpassing fine-tuned models.
arXiv Detail & Related papers (2025-03-05T05:24:55Z)
Improving Retrospective Language Agents via Joint Policy Gradient Optimization [57.35348425288859]
RetroAct is a framework that jointly optimize both task-planning and self-reflective evolution capabilities in language agents. We develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning. We conduct extensive experiments across various testing environments, demonstrating RetroAct has substantial improvements in task performance and decision-making processes.
arXiv Detail & Related papers (2025-03-03T12:54:54Z)
Medchain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking [58.25862290294702]
We present MedChain, a dataset of 12,163 clinical cases that covers five key stages of clinical workflow. We also propose MedChain-Agent, an AI system that integrates a feedback mechanism and a MCase-RAG module to learn from previous cases and adapt its responses.
arXiv Detail & Related papers (2024-12-02T15:25:02Z)
Towards Next-Generation Medical Agent: How o1 is Reshaping Decision-Making in Medical Scenarios [46.729092855387165]
We study the choice of the backbone LLM for medical AI agents, which is the foundation for the agent's overall reasoning and action generation. Our findings demonstrate o1's ability to enhance diagnostic accuracy and consistency, paving the way for smarter, more responsive AI tools.
arXiv Detail & Related papers (2024-11-16T18:19:53Z)
Self-Healing Machine Learning: A Framework for Autonomous Adaptation in Real-World Environments [50.310636905746975]
Real-world machine learning systems often encounter model performance degradation due to distributional shifts in the underlying data generating process. Existing approaches to addressing shifts, such as concept drift adaptation, are limited by their reason-agnostic nature. We propose self-healing machine learning (SHML) to overcome these limitations.
arXiv Detail & Related papers (2024-10-31T20:05:51Z)
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments [2.567146936147657]
We introduce AgentClinic, a multimodal agent benchmark for evaluating large language models (LLM) in simulated clinical environments. We find that solving MedQA problems in the sequential decision-making format of AgentClinic is considerably more challenging, resulting in diagnostic accuracies that can drop to below a tenth of the original accuracy.
arXiv Detail & Related papers (2024-05-13T17:38:53Z)
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce textbfAI Hospital, a framework simulating dynamic medical interactions between emphDoctor as player and NPCs. This setup allows for realistic assessments of LLMs in clinical scenarios. We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
Beyond Direct Diagnosis: LLM-based Multi-Specialist Agent Consultation for Automatic Diagnosis [30.943705201552643]
We propose a framework to model the diagnosis process in the real world by adaptively fusing probability distributions of agents over potential diseases. Our approach requires significantly less parameter updating and training time, enhancing efficiency and practical utility.
arXiv Detail & Related papers (2024-01-29T12:25:30Z)
Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback [19.564416963801268]
We propose an approach called preference learning from process feedback. PLPF integrates the doctor's diagnostic logic into LLMs. We show that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%.
arXiv Detail & Related papers (2024-01-11T06:42:45Z)
Natural Language Programming in Medicine: Administering Evidence Based Clinical Workflows with Autonomous Agents Powered by Generative Large Language Models [29.05425041393475]
Generative Large Language Models (LLMs) hold significant promise in healthcare. This study assessed the potential of LLMs to function as autonomous agents in a simulated tertiary care medical center.
arXiv Detail & Related papers (2024-01-05T15:09:57Z)
Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges [58.32937972322058]
"Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image (MedAI 2021)" competitions. We present a comprehensive summary and analyze each contribution, highlight the strength of the best-performing methods, and discuss the possibility of clinical translations of such methods into the clinic.
arXiv Detail & Related papers (2023-07-30T16:08:45Z)
TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment. In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials. We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
Scalable Online Disease Diagnosis via Multi-Model-Fused Actor-Critic Reinforcement Learning [9.274138493400436]
For those seeking healthcare advice online, AI based dialogue agents capable of interacting with patients to perform automatic disease diagnosis are a viable option. This can be formulated as a problem of sequential feature (symptom) selection and classification for which reinforcement learning (RL) approaches have been proposed as a natural solution. We propose a Multi-Model-Fused Actor-Critic (MMF-AC) RL framework that consists of a generative actor network and a diagnostic critic network.
arXiv Detail & Related papers (2022-06-08T03:06:16Z)
Semi-Supervised Variational Reasoning for Medical Dialogue Generation [70.838542865384]
Two key characteristics are relevant for medical dialogue generation: patient states and physician actions. We propose an end-to-end variational reasoning approach to medical dialogue generation. A physician policy network composed of an action-classifier and two reasoning detectors is proposed for augmented reasoning ability.
arXiv Detail & Related papers (2021-05-13T04:14:35Z)
Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation. adversarially generated samples are used during domain adaptation. Results confirm the effectiveness of our method and the generality on different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.