Explainable Reinforcement Learning via Model Transforms
- URL: http://arxiv.org/abs/2209.12006v1
- Date: Sat, 24 Sep 2022 13:18:06 GMT
- Title: Explainable Reinforcement Learning via Model Transforms
- Authors: Mira Finkelstein, Lucy Liu, Nitsan Levy Schlot, Yoav Kolumbus, David C. Parkes, Jeffrey S. Rosenschein and Sarah Keren
- Abstract summary: We argue that even if the underlying Markov Decision Process is not fully known, it can nevertheless be exploited to automatically generate explanations.
We suggest using formal MDP abstractions and transforms, previously used in the literature for expediting the search for optimal policies, to automatically produce explanations.
- Score: 18.385505289067023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding emerging behaviors of reinforcement learning (RL) agents may be
difficult since such agents are often trained in complex environments using
highly complex decision making procedures. This has given rise to a variety of
approaches to explainability in RL that aim to reconcile discrepancies that may
arise between the behavior of an agent and the behavior that is anticipated by
an observer. Most recent approaches have relied either on domain knowledge,
that may not always be available, on an analysis of the agent's policy, or on
an analysis of specific elements of the underlying environment, typically
modeled as a Markov Decision Process (MDP). Our key claim is that even if the
underlying MDP is not fully known (e.g., the transition probabilities have not
been accurately learned) or is not maintained by the agent (i.e., when using
model-free methods), it can nevertheless be exploited to automatically generate
explanations. For this purpose, we suggest using formal MDP abstractions and
transforms, previously used in the literature for expediting the search for
optimal policies, to automatically produce explanations. Since such transforms
are typically based on a symbolic representation of the environment, they may
represent meaningful explanations for gaps between the anticipated and actual
agent behavior. We formally define this problem, suggest a class of transforms
that can be used for explaining emergent behaviors, and suggest methods that
enable efficient search for an explanation. We demonstrate the approach on a
set of standard benchmarks.
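To make the idea concrete, here is a minimal Python sketch of the search-for-an-explanation loop described above. It assumes a small tabular MDP given as a transition tensor P of shape (S, A, S) and a reward matrix R of shape (S, A); the transform class (a single hypothetical "remove slip" transform) and the acceptance test (the observer's anticipated policy becomes optimal under the transformed model) are illustrative stand-ins, not the paper's exact formalization.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Greedy policy of a tabular MDP. P: (S, A, S) transitions, R: (S, A) rewards."""
    V = np.zeros(P.shape[0])
    while True:
        Q = R + gamma * (P @ V)              # one-step Bellman backup, shape (S, A)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1)          # deterministic greedy policy
        V = V_new

def remove_slip(P, R):
    """Hypothetical transform: make transitions deterministic by moving all
    probability mass onto each action's most likely successor state."""
    P_det = np.zeros_like(P)
    s_idx, a_idx = np.indices(P.shape[:2])
    P_det[s_idx, a_idx, P.argmax(axis=2)] = 1.0
    return P_det, R

def explain_by_transform(P, R, anticipated_policy, candidates):
    """Return the first candidate transform under which the observer's
    anticipated policy is optimal; that model change is offered as the
    explanation for the gap between anticipated and actual behavior."""
    for description, transform in candidates:
        P_t, R_t = transform(P.copy(), R.copy())
        if np.array_equal(value_iteration(P_t, R_t), anticipated_policy):
            return description
    return None                              # no transform in the class explains the gap

# Usage (hypothetical inputs): the observer expected a shorter, riskier route,
# which is only optimal if transitions are treated as deterministic.
# candidates = [("transitions treated as deterministic (no slip)", remove_slip)]
# print(explain_by_transform(P, R, anticipated_policy, candidates))
```

The returned description points at the model element the observer appears to have misjudged (here, treating stochastic transitions as deterministic), which is the kind of gap-explaining transform the abstract describes; the paper's own transform class and search procedure are more general.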
Related papers
- Demystifying Reinforcement Learning in Production Scheduling via Explainable AI [0.7515066610159392]
Deep Reinforcement Learning (DRL) is a frequently employed technique to solve scheduling problems.
Although DRL agents excel at delivering viable results in short computing times, their reasoning remains opaque.
We apply two explainable AI (xAI) frameworks to describe the reasoning behind scheduling decisions of a specialized DRL agent in a flow production.
arXiv Detail & Related papers (2024-08-19T09:39:01Z)
- Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation [7.647395374489533]
We propose an approach to generate natural language explanations for an agent's behavior based only on observations of states and actions.
We show that our approach generates explanations as helpful as those produced by a human domain expert.
arXiv Detail & Related papers (2023-11-29T20:16:23Z)
- Inverse Decision Modeling: Learning Interpretable Representations of Behavior [72.80902932543474]
We develop an expressive, unifying perspective on inverse decision modeling.
We use this to formalize the inverse problem (as a descriptive model).
We illustrate how this structure enables learning (interpretable) representations of (bounded) rationality.
arXiv Detail & Related papers (2023-10-28T05:05:01Z)
- Explainable Multi-Agent Reinforcement Learning for Temporal Queries [18.33682005623418]
This work presents an approach for generating policy-level contrastive explanations for MARL to answer a temporal user query.
The proposed approach encodes the temporal query as a PCTL logic formula and checks if the query is feasible under a given MARL policy.
The results of a user study show that the generated explanations significantly improve user performance and satisfaction.
arXiv Detail & Related papers (2023-05-17T17:04:29Z)
- GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations [0.7874708385247353]
We propose a novel but simple method to generate counterfactual explanations for RL agents.
Our method is fully model-agnostic and we demonstrate that it outperforms the only previous method in several computational metrics.
arXiv Detail & Related papers (2023-02-24T15:29:43Z)
- Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration [17.27164535440641]
Posterior sampling is a promising approach, but it requires Bayesian inference and dynamic programming.
We show that even though partial models exclude relevant information from the environment, they can nevertheless lead to good policies.
arXiv Detail & Related papers (2023-02-08T18:35:24Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
We define explainability through the interpretability of the explanations and the faithfulness of the explainability model in the field of process outcome prediction.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse to this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Explaining Reinforcement Learning Policies through Counterfactual Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time.
Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution.
In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z)
- What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents employ a recurrent neural network (RNN) for the purpose of "learning a learning algorithm".
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
arXiv Detail & Related papers (2021-04-29T20:34:39Z)
- Plannable Approximations to MDP Homomorphisms: Equivariance under Actions [72.30921397899684]
We introduce a contrastive loss function that enforces action equivariance on the learned representations.
We prove that when our loss is zero, we have a homomorphism of a deterministic Markov Decision Process.
We show experimentally that for deterministic MDPs, the optimal policy in the abstract MDP can be successfully lifted to the original MDP.
arXiv Detail & Related papers (2020-02-27T08:29:10Z)
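For the MDP-homomorphism entry above, the core idea of a contrastive loss that enforces action equivariance can be sketched as follows. This is a minimal illustration assuming a state encoder `encode` and a latent transition model `latent_step` (both hypothetical names); the hinge-with-negative-sample form follows the general recipe implied by the summary, not necessarily the paper's exact objective.

```python
import numpy as np

def equivariance_loss(encode, latent_step, s, a, s_next, s_neg, margin=1.0):
    """Contrastive loss enforcing action equivariance: applying the latent
    transition to z = encode(s) should land near encode(s_next) and stay at
    least `margin` away from the encoding of a randomly drawn negative state."""
    pred = latent_step(encode(s), a)                 # predicted next latent state
    positive = np.sum((pred - encode(s_next)) ** 2)  # pull toward true successor
    negative = np.sum((pred - encode(s_neg)) ** 2)   # push away from the negative
    return positive + max(0.0, margin - negative)
```

When such a loss is driven to zero over the transitions of a deterministic MDP, latent transitions mirror environment transitions, which is the homomorphism property the summary refers to; planning in the abstract (latent) MDP then yields a policy that can be lifted back to the original MDP.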