TalkToAgent: A Human-centric Explanation of Reinforcement Learning Agents with Large Language Models
- URL: http://arxiv.org/abs/2509.04809v2
- Date: Mon, 08 Sep 2025 00:52:15 GMT
- Title: TalkToAgent: A Human-centric Explanation of Reinforcement Learning Agents with Large Language Models
- Authors: Haechang Kim, Hao Chen, Can Li, Jong Min Lee
- Abstract summary: We introduce TalkToAgent, a framework that delivers interactive, natural-language explanations for Reinforcement Learning policies. Its architecture of five specialized agents (Coordinator, Explainer, Coder, Evaluator, and Debugger) enables TalkToAgent to automatically map user queries to relevant XRL tools.
- Score: 15.125981288047546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explainable Reinforcement Learning (XRL) has emerged as a promising approach to improving the transparency of Reinforcement Learning (RL) agents. However, a gap remains between complex RL policies and domain experts, due to the limited comprehensibility of XRL results and the isolated coverage of current XRL approaches, which leaves users uncertain about which tools to employ. To address these challenges, we introduce TalkToAgent, a multi-agent Large Language Model (LLM) framework that delivers interactive, natural-language explanations for RL policies. Its architecture of five specialized LLM agents (Coordinator, Explainer, Coder, Evaluator, and Debugger) enables TalkToAgent to automatically map user queries to relevant XRL tools and to clarify an agent's actions in terms of key state variables, expected outcomes, or counterfactual explanations. Moreover, our approach extends previous counterfactual explanations by deriving alternative scenarios from qualitative behavioral descriptions, or even from new rule-based policies. We validated TalkToAgent on the quadruple-tank process control problem, a well-known nonlinear control benchmark. Results demonstrated that TalkToAgent mapped user queries to XRL tasks with high accuracy, and that coder-debugger interactions minimized failures in counterfactual generation. Furthermore, qualitative evaluation confirmed that TalkToAgent effectively interpreted the agent's actions and contextualized their meaning within the problem domain.
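As a rough illustration of the routing and coder-debugger loop the abstract describes, the sketch below shows one way such a pipeline could be wired together. It is not the authors' implementation: the keyword router stands in for the LLM Coordinator, and `coder`, `evaluator`, and the dummy rule-based policy are hypothetical stand-ins for the paper's LLM agents.

```python
# Minimal sketch (not the authors' code) of the TalkToAgent-style pipeline:
# a Coordinator maps a natural-language query to an XRL task, and a
# Coder-Evaluator retry loop guards counterfactual-policy generation.

XRL_TASKS = {
    "feature_importance": ["why", "which state", "important"],
    "expected_outcome":   ["what happens", "outcome", "predict"],
    "counterfactual":     ["what if", "instead", "counterfactual"],
}

def coordinator(query: str) -> str:
    """Stand-in for the LLM Coordinator: route a query to an XRL task."""
    q = query.lower()
    for task, cues in XRL_TASKS.items():
        if any(cue in q for cue in cues):
            return task
    return "feature_importance"  # default; a real LLM would ask back

def coder(query: str, attempt: int) -> str:
    """Stand-in for the Coder agent: emit rule-based policy code as text."""
    # A real system would prompt an LLM; here we return a fixed rule policy.
    return "def policy(state):\n    return 0.5 if state[0] > 0.4 else 0.8\n"

def evaluator(policy_src: str) -> bool:
    """Stand-in for the Evaluator: check the generated policy runs."""
    try:
        scope: dict = {}
        exec(policy_src, scope)          # compile the candidate policy
        scope["policy"]([0.5, 0.2])      # smoke-test on a dummy state
        return True
    except Exception:
        return False

def counterfactual_tool(query: str, max_retries: int = 3) -> str:
    """Coder-Debugger loop: regenerate until the Evaluator accepts."""
    for attempt in range(max_retries):
        src = coder(query, attempt)
        if evaluator(src):
            return src  # hand off to rollout and explanation
    raise RuntimeError("counterfactual generation failed after retries")

if __name__ == "__main__":
    query = "What if the agent kept valve 1 nearly closed instead?"
    task = coordinator(query)
    print("routed to:", task)  # -> counterfactual
    if task == "counterfactual":
        print(counterfactual_tool(query))
```

In the paper's setting the retry loop is driven by a Debugger agent that inspects execution errors; the fixed-retry loop here is a simplification.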
Related papers
- Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization [61.641777037967366]
Proactive large language model (LLM) agents aim to actively plan, query, and interact over multiple turns. Agentic reinforcement learning (RL) has emerged as a promising solution for training such agents in multi-turn settings. We propose BAO, an agentic RL framework that uses behavior enhancement to enrich proactive reasoning and information-gathering capabilities.
arXiv Detail & Related papers (2026-02-11T20:40:43Z) - Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning [84.70211451226835]
Large Language Model (LLM) Agents are constrained by a dependency on human-curated data. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data. Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks.
arXiv Detail & Related papers (2025-11-20T05:01:57Z) - Interact-RAG: Reason and Interact with the Corpus, Beyond Black-Box Retrieval [49.85856484781787]
We introduce Interact-RAG, a new paradigm that elevates the LLM agent into an active manipulator of the retrieval process. We develop a reasoning-enhanced workflow, which enables both zero-shot execution and the synthesis of interaction trajectories. Experiments across six benchmarks demonstrate that Interact-RAG significantly outperforms other advanced methods.
arXiv Detail & Related papers (2025-10-31T15:48:43Z) - QAgent: A modular Search Agent with Interactive Query Understanding [25.147900132089777]
Large language models excel at natural language tasks but are limited by their static parametric knowledge. We propose a unified agentic RAG framework that employs a search agent for adaptive retrieval. Experiments show QAgent excels at QA and serves as a plug-and-play module for real-world deployment.
arXiv Detail & Related papers (2025-10-09T16:08:05Z) - AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering [51.07491603393163]
AgentRouter is a framework that formulates multi-agent QA as a knowledge-graph-guided routing problem supervised by empirical performance signals. By leveraging soft supervision and weighted aggregation of agent outputs, AgentRouter learns principled collaboration schemes that capture the complementary strengths of diverse agents.
arXiv Detail & Related papers (2025-10-06T23:20:49Z) - AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning [129.44038804430542]
We introduce AgentGym-RL, a new framework to train LLM agents for multi-turn interactive decision-making through RL. We propose ScalingInter-RL, a training approach designed for exploration-exploitation balance and stable RL optimization. Our agents match or surpass commercial models on 27 tasks across diverse environments.
arXiv Detail & Related papers (2025-09-10T16:46:11Z) - AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities. We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o. We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z) - SOPBench: Evaluating Language Agents at Following Standard Operating Procedures and Constraints [59.645885492637845]
SOPBench is an evaluation pipeline that transforms each service-specific SOP code program into a directed graph of executable functions and requires agents to call these functions based on natural-language SOP descriptions. We evaluate 18 leading models, and results show the task is challenging even for top-tier models.
arXiv Detail & Related papers (2025-03-11T17:53:02Z) - XAgents: A Framework for Interpretable Rule-Based Multi-Agents Cooperation [4.872631281771186]
XAgents is an interpretable multi-agent cooperative framework based on an IF-THEN rule-based system.
XAgents transmits the task to the disparate domain rules, which subsequently generate the various responses.
The incorporation of rule-based interpretability serves to bolster user confidence in the XAgents framework.
arXiv Detail & Related papers (2024-11-21T08:28:27Z) - Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, an LLM-based autonomous agent framework.
At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence.
We introduce two new strategies to enhance the performance of AgentCOT.
arXiv Detail & Related papers (2024-09-19T02:20:06Z) - Semifactual Explanations for Reinforcement Learning [1.5320737596132754]
Reinforcement Learning (RL) is a learning paradigm in which the agent learns from its environment through trial and error.
Deep reinforcement learning (DRL) algorithms represent the agent's policies using neural networks, making their decisions difficult to interpret.
Explaining the behaviour of DRL agents is necessary to advance user trust, increase engagement, and facilitate integration with real-life tasks.
arXiv Detail & Related papers (2024-09-09T08:37:47Z) - GANterfactual-RL: Understanding Reinforcement Learning Agents'
Strategies through Visual Counterfactual Explanations [0.7874708385247353]
We propose a novel but simple method to generate counterfactual explanations for RL agents.
Our method is fully model-agnostic and we demonstrate that it outperforms the only previous method in several computational metrics.
arXiv Detail & Related papers (2023-02-24T15:29:43Z) - A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges [51.699348215510575]
Reinforcement Learning (RL) is a popular machine learning paradigm in which intelligent agents interact with the environment to fulfill a long-term goal. Despite the encouraging results achieved, the deep neural network backbone is widely deemed a black box that prevents practitioners from trusting and employing trained agents in realistic scenarios where high security and reliability are essential. To alleviate this issue, a large body of literature has been devoted to shedding light on the inner workings of intelligent agents, by constructing either intrinsic interpretability or post-hoc explainability.
arXiv Detail & Related papers (2022-11-12T13:52:06Z) - Explaining Reinforcement Learning Policies through Counterfactual Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time.
Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution.
In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z) - Explainable Reinforcement Learning for Broad-XAI: A Conceptual Framework and Survey [0.7366405857677226]
Reinforcement Learning (RL) methods provide a potential backbone for the cognitive model required for the development of Broad-XAI.
RL represents a suite of approaches that have had increasing success in solving a range of sequential decision-making problems.
This paper aims to introduce a conceptual framework, called the Causal XRL Framework (CXF), that unifies current XRL research and uses RL as a backbone for the development of Broad-XAI.
arXiv Detail & Related papers (2021-08-20T05:18:50Z) - Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning [16.707045765042505]
Current offline RL algorithms are ineffective in multi-agent systems due to the accumulated extrapolation error.
We propose a novel offline RL algorithm, named Implicit Constraint Q-learning (ICQ), which effectively alleviates the extrapolation error.
Experimental results demonstrate that the extrapolation error is reduced to almost zero and insensitive to the number of agents.
arXiv Detail & Related papers (2021-06-07T08:02:31Z)
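A minimal, tabular sketch of the implicit-constraint idea described in the ICQ abstract above may help: the Bellman target is built only from actions that actually appear in the offline data, softmax-weighted by their current Q-values, so no out-of-distribution action value is ever queried. This is an illustrative reconstruction under assumptions (single-agent, tabular, a toy random batch, and the temperature `alpha`), not the paper's multi-agent neural algorithm.

```python
# Hedged, tabular illustration of the implicit-constraint idea: bootstrap
# only from in-dataset actions, so extrapolation error cannot accumulate.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
Q = np.zeros((n_states, n_actions))

# Toy offline batch of (state, action, reward, next_state, next_action);
# next actions come from the behavior policy, so they are in-distribution.
batch = [(int(rng.integers(n_states)), int(rng.integers(n_actions)),
          float(rng.normal()), int(rng.integers(n_states)),
          int(rng.integers(n_actions)))
         for _ in range(200)]

gamma, lr, alpha = 0.95, 0.1, 1.0  # alpha: assumed softmax temperature

for s, a, r, s2, a2 in batch:
    # Actions observed at the next state anywhere in the batch (never OOD).
    in_data = [b[4] for b in batch if b[3] == s2]
    q_next = Q[s2, in_data]
    # Implicit constraint: softmax weights over *dataset* actions replace
    # the max over all actions, so unseen actions are never evaluated.
    w = np.exp((q_next - q_next.max()) / alpha)
    w /= w.sum()
    target = r + gamma * float(w @ q_next)
    Q[s, a] += lr * (target - Q[s, a])
```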