Adaptive Collaboration Strategy for LLMs in Medical Decision Making
- URL: http://arxiv.org/abs/2404.15155v1
- Date: Mon, 22 Apr 2024 06:30:05 GMT
- Title: Adaptive Collaboration Strategy for LLMs in Medical Decision Making
- Authors: Yubin Kim, Chanwoo Park, Hyewon Jeong, Yik Siu Chan, Xuhai Xu, Daniel McDuff, Cynthia Breazeal, Hae Won Park
- Abstract summary: Our framework, Medical Decision-making Agents (MDAgents), aims to address this gap by automatically assigning an effective collaboration structure to LLMs.
The assigned solo or group collaboration structure is tailored to the complexity of the medical task at hand, emulating real-world medical decision-making processes.
MDAgents achieves the best performance in 5 out of 7 benchmarks that require an understanding of multi-modal medical reasoning.
- Score: 40.979954284814895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models have become invaluable in advancing the medical field. Despite their promise, the strategic deployment of LLMs for effective utility in complex medical tasks remains an open question. Our novel framework, Medical Decision-making Agents (MDAgents), aims to address this gap by automatically assigning an effective collaboration structure to LLMs. The assigned solo or group collaboration structure is tailored to the complexity of the medical task at hand, emulating real-world medical decision-making processes. We evaluate our framework and baseline methods with state-of-the-art LLMs across a suite of challenging medical benchmarks: MedQA, MedMCQA, PubMedQA, DDXPlus, PMC-VQA, Path-VQA, and MedVidQA, achieving the best performance in 5 out of 7 benchmarks that require an understanding of multi-modal medical reasoning. Ablation studies reveal that MDAgents excels in adapting the number of collaborating agents to optimize efficiency and accuracy, showcasing its robustness in diverse scenarios. We also explore the dynamics of group consensus, offering insights into how collaborative agents could behave in complex clinical team dynamics. Our code can be found at https://github.com/mitmedialab/MDAgents.
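The abstract describes a two-stage mechanism: a moderator model first grades the complexity of an incoming medical query, then either answers it solo or recruits a group of agents and synthesizes their consensus. Below is a minimal Python sketch of that adaptive-routing idea, assuming a generic `query_llm` chat-completion callable; the prompts, the three-tier difficulty scale, and the agent counts are illustrative assumptions rather than the authors' implementation (the official code is at the repository linked above).

```python
from typing import Callable

# Illustrative sketch only: the prompts, the low/moderate/high scale, and
# the agent counts are assumptions, not taken from the MDAgents codebase.

def assess_complexity(question: str, query_llm: Callable[[str], str]) -> str:
    """Ask a moderator LLM to grade task difficulty as low, moderate, or high."""
    prompt = (
        "Classify the difficulty of this medical question as "
        "'low', 'moderate', or 'high'. Reply with one word.\n\n"
        f"Question: {question}"
    )
    tier = query_llm(prompt).strip().lower()
    return tier if tier in {"low", "moderate", "high"} else "moderate"

def solve(question: str, query_llm: Callable[[str], str]) -> str:
    """Route to a solo agent or a multi-agent discussion based on difficulty."""
    tier = assess_complexity(question, query_llm)
    if tier == "low":
        # Simple case: a single agent answers directly.
        return query_llm(f"Answer this medical question:\n{question}")
    # Harder cases: recruit more specialists, then synthesize a consensus.
    n_agents = 3 if tier == "moderate" else 5
    opinions = [
        query_llm(f"As medical specialist #{i + 1}, assess:\n{question}")
        for i in range(n_agents)
    ]
    joined = "\n".join(f"- {op}" for op in opinions)
    return query_llm(
        f"Synthesize a consensus answer to the question below from these "
        f"specialist opinions:\n{joined}\n\nQuestion: {question}"
    )
```

Here `query_llm` can wrap any chat-completion API, keeping the routing logic independent of the model backend; the real framework additionally adapts the number of collaborating agents, a behavior the ablation studies highlight.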
Related papers
- Enhancing Biomedical Knowledge Retrieval-Augmented Generation with Self-Rewarding Tree Search and Proximal Policy Optimization [50.26966969163348]
Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG).
Existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries.
We propose Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm.
arXiv Detail & Related papers (2024-06-17T06:48:31Z) - AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments [2.567146936147657]
We present AgentClinic: a benchmark to evaluate large language models (LLMs) in simulated clinical environments.
In our benchmark, the doctor agent must uncover the patient's diagnosis through dialogue and active data collection.
We find that introducing bias leads to large reductions in diagnostic accuracy of the doctor agents, as well as reduced compliance, confidence, and follow-up willingness in patient agents.
arXiv Detail & Related papers (2024-05-13T17:38:53Z) - Exploring LLM Multi-Agents for ICD Coding [15.730751450511333]
We present a novel multi-agent method for ICD coding, which mimics the real-world coding process with five agents.
We show that our proposed multi-agent coding framework substantially improves performance on both common and rare codes.
Our method also matches the state-of-the-art ICD coding methods that require pre-training or fine-tuning.
arXiv Detail & Related papers (2024-04-01T15:17:39Z) - Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Models (LLMs)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z) - RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning [14.366349078707263]
This work introduces RJUA-MedDQA, a comprehensive benchmark in the field of medical specialization.
arXiv Detail & Related papers (2024-02-19T06:57:02Z) - Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models [59.60384461302662]
We introduce Asclepius, a novel benchmark for evaluating Medical Multi-Modal Large Language Models (Med-MLLMs).
Asclepius rigorously and comprehensively assesses model capability in terms of distinct medical specialties and different diagnostic capacities.
We also provide an in-depth analysis of 6 Med-MLLMs and compare them with 5 human specialists.
arXiv Detail & Related papers (2024-02-17T08:04:23Z) - AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between a Doctor as the player and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z) - MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning [35.804520192679874]
Large language models (LLMs) encounter significant barriers in medicine and healthcare.
We propose MedAgents, a novel multi-disciplinary collaboration framework for the medical domain.
Our work focuses on the zero-shot setting, which is applicable in real-world scenarios.
arXiv Detail & Related papers (2023-11-16T11:47:58Z) - Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks.
We propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.