TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
- URL: http://arxiv.org/abs/2503.10970v1
- Date: Fri, 14 Mar 2025 00:28:15 GMT
- Title: TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
- Authors: Shanghua Gao, Richard Zhu, Zhenglun Kong, Ayush Noori, Xiaorui Su, Curtis Ginder, Theodoros Tsiligkaridis, Marinka Zitnik
- Abstract summary: TxAgent is an AI agent that analyzes drug interactions, contraindications, and patient-specific treatment strategies. The ToolUniverse consolidates 211 tools from trusted sources, including all US FDA-approved drugs since 1939. It achieves 92.1% accuracy in open-ended drug reasoning tasks, surpassing GPT-4o and outperforming DeepSeek-R1 (671B) in structured multi-step reasoning.
- Score: 22.322166889507184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Precision therapeutics require multimodal adaptive models that generate personalized treatment recommendations. We introduce TxAgent, an AI agent that leverages multi-step reasoning and real-time biomedical knowledge retrieval across a toolbox of 211 tools to analyze drug interactions, contraindications, and patient-specific treatment strategies. TxAgent evaluates how drugs interact at molecular, pharmacokinetic, and clinical levels, identifies contraindications based on patient comorbidities and concurrent medications, and tailors treatment strategies to individual patient characteristics. It retrieves and synthesizes evidence from multiple biomedical sources, assesses interactions between drugs and patient conditions, and refines treatment recommendations through iterative reasoning. It selects tools based on task objectives and executes structured function calls to solve therapeutic tasks that require clinical reasoning and cross-source validation. The ToolUniverse consolidates 211 tools from trusted sources, including all US FDA-approved drugs since 1939 and validated clinical insights from Open Targets. TxAgent outperforms leading LLMs, tool-use models, and reasoning agents across five new benchmarks: DrugPC, BrandPC, GenericPC, TreatmentPC, and DescriptionPC, covering 3,168 drug reasoning tasks and 456 personalized treatment scenarios. It achieves 92.1% accuracy in open-ended drug reasoning tasks, surpassing GPT-4o and outperforming DeepSeek-R1 (671B) in structured multi-step reasoning. TxAgent generalizes across drug name variants and descriptions. By integrating multi-step inference, real-time knowledge grounding, and tool-assisted decision-making, TxAgent ensures that treatment recommendations align with established clinical guidelines and real-world evidence, reducing the risk of adverse events and improving therapeutic decision-making.
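The abstract describes selecting tools based on task objectives and executing structured function calls. A minimal sketch of that pattern follows, assuming a toy keyword-overlap selector. Every name here (`Tool`, `select_tool`, `execute_call`, the two lookup functions) is hypothetical and is not taken from TxAgent or ToolUniverse; the tools return placeholder strings, not clinical data.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[..., str]


# Hypothetical stand-ins for real biomedical lookups; they return placeholders only.
def lookup_interactions(drug_a: str, drug_b: str) -> str:
    return f"interaction record for {drug_a} + {drug_b} (placeholder)"


def lookup_contraindications(drug: str, condition: str) -> str:
    return f"contraindication record for {drug} with {condition} (placeholder)"


TOOLS = [
    Tool("lookup_interactions",
         "find interaction between two drugs",
         lookup_interactions),
    Tool("lookup_contraindications",
         "find contraindication of a drug for a patient condition",
         lookup_contraindications),
]


def select_tool(objective: str, tools: List[Tool]) -> Tool:
    """Pick the tool whose description shares the most words with the objective.

    Assumption: word overlap stands in for whatever selection method the agent uses.
    """
    words = set(objective.lower().split())
    return max(tools, key=lambda t: len(words & set(t.description.lower().split())))


def execute_call(call: dict, tools: List[Tool]) -> str:
    """Run a structured call of the form {"name": ..., "arguments": {...}}."""
    registry = {t.name: t for t in tools}
    return registry[call["name"]].run(**call["arguments"])


def solve(steps: List[Tuple[str, dict]]) -> List[str]:
    """For each (objective, arguments) step: select a tool, build a call, collect the result."""
    evidence = []
    for objective, arguments in steps:
        tool = select_tool(objective, TOOLS)
        call = {"name": tool.name, "arguments": arguments}
        evidence.append(execute_call(call, TOOLS))
    return evidence


if __name__ == "__main__":
    for line in solve([
        ("interaction between two drugs", {"drug_a": "warfarin", "drug_b": "aspirin"}),
        ("contraindication for a patient condition", {"drug": "warfarin", "condition": "peptic ulcer"}),
    ]):
        print(line)
```

Keeping each call as a plain dictionary with a tool name and named arguments means the selection step and the execution step can be inspected or logged separately, which is the property the abstract's "structured function calls" suggests.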
Related papers
- EvoClinician: A Self-Evolving Agent for Multi-Turn Medical Diagnosis via Test-Time Evolutionary Learning [72.70291772077738]
We propose Med-Inquire, a new benchmark designed to evaluate an agent's ability to perform multi-turn diagnosis. We then introduce EvoClinician, a self-evolving agent that learns efficient diagnostic strategies at test time. Our experiments show EvoClinician outperforms continual learning baselines and other self-evolving agents like memory agents.
arXiv Detail & Related papers (2026-01-30T13:26:18Z) - MedAI: Evaluating TxAgent's Therapeutic Agentic Reasoning in the NeurIPS CURE-Bench Competition [6.191248426050678]
Therapeutic decision-making in clinical medicine requires robust, multi-step reasoning grounded in reliable biomedical knowledge. Agentic AI methods, exemplified by TxAgent, address these challenges through iterative retrieval-augmented generation (RAG). This work presents insights derived from our participation in the CURE-Bench NeurIPS 2025 Challenge, which benchmarks therapeutic-reasoning systems.
arXiv Detail & Related papers (2025-12-12T16:01:48Z) - Lessons Learned from Evaluation of LLM based Multi-agents in Safer Therapy Recommendation [9.84660526673816]
This study investigated the feasibility and value of using a Large Language Model (LLM)-based multi-agent system for safer therapy recommendations. We designed a single agent and a MAS framework simulating multidisciplinary team (MDT) decision-making. We compared MAS performance with single-agent approaches and real-world benchmarks.
arXiv Detail & Related papers (2025-07-15T02:01:38Z) - An Agentic System for Rare Disease Diagnosis with Traceable Reasoning [58.78045864541539]
We introduce DeepRare, the first rare disease diagnosis agentic system powered by a large language model (LLM). DeepRare generates ranked diagnostic hypotheses for rare diseases, each accompanied by a transparent chain of reasoning. The system demonstrates exceptional diagnostic performance among 2,919 diseases, achieving 100% accuracy for 1013 diseases.
arXiv Detail & Related papers (2025-06-25T13:42:26Z) - MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning [63.63542462400175]
We propose MMedAgent-RL, a reinforcement learning-based multi-agent framework that enables dynamic, optimized collaboration among medical agents. Specifically, we train two GP agents based on Qwen2.5-VL via RL: the triage doctor learns to assign patients to appropriate specialties, while the attending physician integrates the judgments from multi-specialists. Experiments on five medical VQA benchmarks demonstrate that MMedAgent-RL not only outperforms both open-source and proprietary Med-LVLMs, but also exhibits human-like reasoning patterns.
arXiv Detail & Related papers (2025-05-31T13:22:55Z) - DrugPilot: LLM-based Parameterized Reasoning Agent for Drug Discovery [54.79763887844838]
Large language models (LLMs) integrated with autonomous agents hold significant potential for advancing scientific discovery through automated reasoning and task execution. We introduce DrugPilot, an LLM-based agent system with a parameterized reasoning architecture designed for end-to-end scientific in drug discovery. DrugPilot significantly outperforms state-of-the-art agents such as ReAct and LoT, achieving task completion rates of 98.0%, 93.5%, and 64.0% for simple, multi-tool, and multi-turn scenarios, respectively.
arXiv Detail & Related papers (2025-05-20T05:18:15Z) - TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews [54.35097932763878]
Thematic analysis (TA) is a widely used qualitative approach for uncovering latent meanings in unstructured text data.
Here, we propose TAMA: A Human-AI Collaborative Thematic Analysis framework using Multi-Agent LLMs for clinical interviews.
We demonstrate that TAMA outperforms existing LLM-assisted TA approaches, achieving higher thematic hit rate, coverage, and distinctiveness.
arXiv Detail & Related papers (2025-03-26T15:58:16Z) - Towards Conversational AI for Disease Management [29.189384095061722]
Articulate Medical Intelligence Explorer (AMIE) is an agentic system optimised for clinical management and dialogue. AMIE is non-inferior to PCPs in management reasoning as assessed by specialist physicians. AMIE's strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management.
arXiv Detail & Related papers (2025-03-08T05:48:58Z) - Collaborative Expert LLMs Guided Multi-Objective Molecular Optimization [51.104444856052204]
We present MultiMol, a collaborative large language model (LLM) system designed to guide multi-objective molecular optimization. In evaluations across six multi-objective optimization tasks, MultiMol significantly outperforms existing methods, achieving an 82.30% success rate.
arXiv Detail & Related papers (2025-03-05T13:47:55Z) - Natural Language-Assisted Multi-modal Medication Recommendation [97.07805345563348]
We introduce the Natural Language-Assisted Multi-modal Medication Recommendation (NLA-MMR). The NLA-MMR is a multi-modal alignment framework designed to learn knowledge from the patient view and medication view jointly. In this vein, we employ pretrained language models (PLMs) to extract in-domain knowledge regarding patients and medications.
arXiv Detail & Related papers (2025-01-13T09:51:50Z) - Towards Next-Generation Medical Agent: How o1 is Reshaping Decision-Making in Medical Scenarios [46.729092855387165]
We study the choice of the backbone LLM for medical AI agents, which is the foundation for the agent's overall reasoning and action generation. Our findings demonstrate o1's ability to enhance diagnostic accuracy and consistency, paving the way for smarter, more responsive AI tools.
arXiv Detail & Related papers (2024-11-16T18:19:53Z) - DrugAgent: Explainable Drug Repurposing Agent with Large Language Model-based Reasoning [10.528489471229946]
We propose a multi-agent framework to enhance the drug repurposing process using state-of-the-art machine learning techniques and knowledge integration.
Our framework comprises several specialized agents: an AI Agent trains robust drug-target interaction (DTI) models; a Knowledge Graph Agent utilizes the drug-gene interaction database (DGIdb) to systematically extract DTIs.
By integrating outputs from these agents, our system effectively harnesses diverse data sources, including external databases, to propose viable repurposing candidates.
arXiv Detail & Related papers (2024-08-23T21:24:59Z) - Empowering Clinicians with Medical Decision Transformers: A Framework for Sepsis Treatment [5.0005174003014865]
We propose the medical decision transformer (MeDT) to solve tasks in safety-critical settings.
MeDT uses the decision transformer architecture to learn a policy for drug dosage recommendation.
MeDT captures complex dependencies among a patient's medical history, treatment decisions, outcomes, and short-term effects on stability.
arXiv Detail & Related papers (2024-07-28T03:40:00Z) - MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making [45.74980058831342]
We introduce a novel multi-agent framework, named Medical Decision-making Agents (MDAgents).
The assigned solo or group collaboration structure is tailored to the medical task at hand, emulating real-world medical decision-making processes.
MDAgents achieved the best performance in seven out of ten benchmarks on tasks requiring an understanding of medical knowledge.
arXiv Detail & Related papers (2024-04-22T06:30:05Z) - AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning [11.8292941452582]
We introduce AgentMD, a novel language agent capable of curating and applying clinical calculators across various clinical contexts.
AgentMD has automatically curated a collection of 2,164 diverse clinical calculators with executable functions and structured documentation, collectively named RiskCalcs.
Manual evaluations show that RiskCalcs tools achieve an accuracy of over 80% on three quality metrics.
arXiv Detail & Related papers (2024-02-20T18:37:19Z) - RECOMED: A Comprehensive Pharmaceutical Recommendation System [8.681590862953623]
A pharmaceutical recommendation system was designed based on patient and drug features extracted from Drugs.com and Druglib.com.
To the best of our knowledge, we are the first group to consider patients' conditions and history in the proposed approach for selecting a specific medicine appropriate for that particular user.
arXiv Detail & Related papers (2022-12-31T20:04:31Z) - Conditional Generation Net for Medication Recommendation [73.09366442098339]
Medication recommendation aims to provide a proper set of medicines according to patients' diagnoses, which is a critical task in clinics.
We propose Conditional Generation Net (COGNet) which introduces a novel copy-or-predict mechanism to generate the set of medicines.
We validate the proposed model on the public MIMIC data set, and the experimental results show that the proposed model can outperform state-of-the-art approaches.
arXiv Detail & Related papers (2022-02-14T10:16:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.