Explainable Multi-Agent Reinforcement Learning for Temporal Queries
- URL: http://arxiv.org/abs/2305.10378v1
- Date: Wed, 17 May 2023 17:04:29 GMT
- Title: Explainable Multi-Agent Reinforcement Learning for Temporal Queries
- Authors: Kayla Boggess, Sarit Kraus, and Lu Feng
- Abstract summary: This work presents an approach for generating policy-level contrastive explanations for MARL to answer a temporal user query.
The proposed approach encodes the temporal query as a PCTL logic formula and checks if the query is feasible under a given MARL policy.
The results of a user study show that the generated explanations significantly improve user performance and satisfaction.
- Score: 18.33682005623418
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As multi-agent reinforcement learning (MARL) systems are increasingly
deployed throughout society, it is imperative yet challenging for users to
understand the emergent behaviors of MARL agents in complex environments. This
work presents an approach for generating policy-level contrastive explanations
for MARL to answer a temporal user query, which specifies a sequence of tasks
completed by agents with possible cooperation. The proposed approach encodes
the temporal query as a PCTL logic formula and checks if the query is feasible
under a given MARL policy via probabilistic model checking. Such explanations
can help reconcile discrepancies between the actual and anticipated multi-agent
behaviors. The proposed approach also generates correct and complete
explanations to pinpoint reasons that make a user query infeasible. We have
successfully applied the proposed approach to four benchmark MARL domains (up
to 9 agents in one domain). Moreover, the results of a user study show that the
generated explanations significantly improve user performance and satisfaction.
Related papers
- ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions [68.81939215223818]
ProductAgent is a conversational information seeking agent equipped with abilities of strategic clarification question generation and dynamic product retrieval.
We develop the agent with strategies for product feature summarization, query generation, and product retrieval.
Experiments show that ProductAgent interacts positively with the user and enhances retrieval performance with increasing dialogue turns.
arXiv Detail & Related papers (2024-07-01T03:50:23Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA)
We propose a novel adaptive QA framework, that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z) - Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z) - On Diagnostics for Understanding Agent Training Behaviour in Cooperative
MARL [5.124364759305485]
We argue that relying solely on the empirical returns may obscure crucial insights into agent behaviour.
In this paper, we explore the application of explainable AI (XAI) tools to gain profound insights into agent behaviour.
arXiv Detail & Related papers (2023-12-13T19:10:10Z) - Formally Specifying the High-Level Behavior of LLM-Based Agents [24.645319505305316]
LLMs have emerged as promising tools for solving challenging problems without the need for task-specific finetuned models.
Currently, the design and implementation of such agents is ad hoc, as the wide variety of tasks that LLM-based agents may be applied to naturally means there can be no one-size-fits-all approach to agent design.
We propose a minimalistic generation framework that simplifies the process of building agents.
arXiv Detail & Related papers (2023-10-12T17:24:15Z) - ASQ-IT: Interactive Explanations for Reinforcement-Learning Agents [7.9603223299524535]
We present ASQ-IT -- an interactive tool that presents video clips of the agent acting in its environment based on queries given by the user that describe temporal properties of behaviors of interest.
Our approach is based on formal methods: queries in ASQ-IT's user interface map to a fragment of Linear Temporal Logic over finite traces (LTLf), which we developed, and our algorithm for query processing is based on automata theory.
arXiv Detail & Related papers (2023-01-24T11:57:37Z) - Successive Prompting for Decomposing Complex Questions [50.00659445976735]
Recent works leverage the capabilities of large language models (LMs) to perform complex question answering in a few-shot setting.
We introduce Successive Prompting'', where we iteratively break down a complex task into a simple task, solve it, and then repeat the process until we get the final solution.
Our best model (with successive prompting) achieves an improvement of 5% absolute F1 on a few-shot version of the DROP dataset.
arXiv Detail & Related papers (2022-12-08T06:03:38Z) - Explainable Reinforcement Learning via Model Transforms [18.385505289067023]
We argue that even if the underlying Markov Decision Process is not fully known, it can nevertheless be exploited to automatically generate explanations.
We suggest using formal MDP abstractions and transforms, previously used in the literature for expediting the search for optimal policies, to automatically produce explanations.
arXiv Detail & Related papers (2022-09-24T13:18:06Z) - Toward Policy Explanations for Multi-Agent Reinforcement Learning [18.33682005623418]
We present novel methods to generate two types of policy explanations for MARL.
Experimental results on three MARL domains demonstrate the scalability of our methods.
A user study shows that the generated explanations significantly improve user performance and increase subjective ratings on metrics such as user satisfaction.
arXiv Detail & Related papers (2022-04-26T20:07:08Z) - Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation [61.114580368455236]
User purchasing prediction with multi-behavior information remains a challenging problem for current recommendation systems.
We propose the concept of hyper meta-path to construct hyper meta-paths or hyper meta-graphs to explicitly illustrate the dependencies among different behaviors of a user.
Thanks to the recent success of graph contrastive learning, we leverage it to learn embeddings of user behavior patterns adaptively instead of assigning a fixed scheme to understand the dependencies among different behaviors.
arXiv Detail & Related papers (2021-09-07T04:28:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.