Policy Regularization for Legible Behavior
- URL: http://arxiv.org/abs/2203.04303v1
- Date: Tue, 8 Mar 2022 10:55:46 GMT
- Title: Policy Regularization for Legible Behavior
- Authors: Michele Persiani, Thomas Hellström
- Abstract summary: In Reinforcement Learning, interpretability generally means providing insight into the agent's mechanisms.
This paper borrows methods from the Explainable Planning literature that focus on the legibility of the agent.
In our formulation, the decision boundary introduced by legibility impacts the states in which the agent's policy returns an action that also has high likelihood under other policies.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Reinforcement Learning, interpretability generally means providing insight
into the agent's mechanisms such that its decisions are understandable by an
expert upon inspection. This definition, with the resulting methods from the
literature, may however fall short for online settings where the fluency of
interactions prohibits deep inspections of the decision-making algorithm. To
support interpretability in online settings it is useful to borrow from the
Explainable Planning literature methods that focus on the legibility of the
agent, by making its intention easily discernable in an observer model. As we
propose in this paper, injecting legible behavior inside an agent's policy
doesn't require modifying components of its learning algorithm. Rather, the
agent's optimal policy can be regularized for legibility by evaluating how the
policy may produce observations that would make an observer infer an incorrect
policy. In our formulation, the decision boundary introduced by legibility
impacts the states in which the agent's policy returns an action that also has
high likelihood under other policies. In these cases, a trade-off is made
between that action and a legible but sub-optimal one.
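The sketch below illustrates one way such a regularization could be realized in code: it blends the agent's Boltzmann policy with a legibility term that measures how strongly each action points to the agent's true policy rather than to alternative policies an observer might consider. The observer model, the candidate policy set, and the mixing weight `beta` are assumptions of this example, not the paper's exact formulation.

```python
# A minimal, illustrative sketch of legibility-regularized action selection.
# NOT the paper's exact formulation: the observer model, the candidate policy
# set, and the mixing weight `beta` are assumptions made for this example.
import numpy as np

def softmax(x, tau=1.0):
    z = np.asarray(x, dtype=float) / tau
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def legible_action_probs(q_true, q_others, beta=0.5, tau=1.0):
    """Blend the agent's Boltzmann policy with a legibility term.

    q_true   : Q-values of the agent's own policy in the current state.
    q_others : list of Q-value vectors for alternative policies the
               observer might attribute to the agent.
    beta     : trade-off between optimality and legibility (assumed).
    """
    pi_true = softmax(q_true, tau)                 # agent's own policy
    pi_alt = [softmax(q, tau) for q in q_others]   # observer's alternatives

    # Legibility of each action: posterior probability that an observer with a
    # uniform prior over the candidate policies attributes the action to the
    # agent's true policy rather than to one of the alternatives.
    denom = pi_true + sum(pi_alt)
    legibility = pi_true / np.maximum(denom, 1e-12)

    # Regularized policy: trade off action likelihood against legibility.
    scores = (1.0 - beta) * pi_true + beta * legibility
    return scores / scores.sum()

# Example: action 0 is optimal but common to all policies; action 2 is slightly
# sub-optimal yet unambiguous, so the legible policy shifts probability to it.
p = legible_action_probs(q_true=[1.0, 0.2, 0.9],
                         q_others=[[1.0, 0.8, 0.1], [1.0, 0.1, 0.2]])
print(p)
```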
Related papers
- Policy Dispersion in Non-Markovian Environment [53.05904889617441]
This paper aims to learn diverse policies from the history of state-action pairs in a non-Markovian environment.
We first adopt a transformer-based method to learn policy embeddings.
Then, we stack the policy embeddings to construct a dispersion matrix to induce a set of diverse policies.
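As a rough illustration of the dispersion idea, one can stack policy embeddings, form a pairwise similarity matrix, and score how spread out the policies are. This is only a sketch: the transformer encoder is abstracted away, and the RBF kernel and log-determinant bonus below are assumptions, not the paper's construction.

```python
# Illustrative sketch: a dispersion matrix over policy embeddings used as a
# diversity signal. The embedding model (a transformer in the paper) is
# abstracted away; the kernel and log-det bonus are assumptions.
import numpy as np

def dispersion_matrix(embeddings, bandwidth=1.0):
    """RBF similarity matrix over a stack of policy embeddings (n, d)."""
    E = np.asarray(embeddings, dtype=float)
    sq_dists = np.sum((E[:, None, :] - E[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def diversity_bonus(embeddings):
    """Log-determinant of the dispersion matrix: larger when policies differ."""
    K = dispersion_matrix(embeddings)
    sign, logdet = np.linalg.slogdet(K + 1e-6 * np.eye(len(K)))
    return logdet

# Three policy embeddings; the bonus grows as the policies spread apart.
print(diversity_bonus([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]))
print(diversity_bonus([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]))
```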
arXiv Detail & Related papers (2023-02-28T11:58:39Z) - Bounded Robustness in Reinforcement Learning via Lexicographic Objectives [54.00072722686121]
Policy robustness in Reinforcement Learning may not be desirable at any cost.
We study how policies can be maximally robust to arbitrary observational noise.
We propose a robustness-inducing scheme, applicable to any policy algorithm, that trades off expected policy utility for robustness.
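One generic way to express such a utility/robustness trade-off, shown purely as an illustration, is to penalize how much the action distribution shifts under perturbed observations. The Gaussian noise model, the KL penalty, and the weight `lam` below are assumptions; the paper's actual scheme is built on lexicographic objectives.

```python
# Illustrative objective in the spirit of trading expected utility for
# robustness to observational noise. Noise model, KL penalty, and `lam` are
# assumptions; they are not the paper's lexicographic formulation.
import numpy as np

def kl(p, q, eps=1e-12):
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

def robust_objective(policy, obs, expected_return, lam=0.1,
                     noise_scale=0.05, n_samples=8, rng=None):
    """Expected return minus a penalty for how much the action distribution
    changes when the observation is perturbed."""
    rng = rng or np.random.default_rng(0)
    base = policy(obs)
    penalty = np.mean([kl(base, policy(obs + noise_scale * rng.standard_normal(obs.shape)))
                       for _ in range(n_samples)])
    return expected_return - lam * penalty

# Toy policy: softmax of a fixed linear map of the observation.
W = np.array([[1.0, -0.5], [0.2, 0.8], [-0.3, 0.4]])
def toy_policy(obs):
    z = W @ obs
    e = np.exp(z - z.max())
    return e / e.sum()

print(robust_objective(toy_policy, np.array([0.5, -0.2]), expected_return=1.0))
```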
arXiv Detail & Related papers (2022-09-30T08:53:18Z) - "Guess what I'm doing": Extending legibility to sequential decision tasks [7.352593846694083]
We investigate the notion of legibility in sequential decision tasks under uncertainty.
Our proposed approach, dubbed PoL-MDP, is able to handle uncertainty while remaining computationally tractable.
arXiv Detail & Related papers (2022-09-19T16:01:33Z) - Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
Modeling decision makers as agents that update their perceived effects of actions online, we cast the problem of inferring the policy underlying a set of observed trajectories as the inverse of this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z) - Interrogating the Black Box: Transparency through Information-Seeking Dialogues [9.281671380673306]
We propose to construct an investigator agent that queries a learning agent in order to check its adherence to an ethical policy.
This formal dialogue framework is the main contribution of this paper.
We argue that the introduced formal dialogue framework opens many avenues both in the area of compliance checking and in the analysis of properties of opaque systems.
arXiv Detail & Related papers (2021-02-09T09:14:04Z) - Policy Supervectors: General Characterization of Agents by their Behaviour [18.488655590845163]
We propose policy supervectors for characterizing agents by the distribution of states they visit.
Policy supervectors can characterize policies regardless of their design philosophy and scale to thousands of policies on a single workstation.
We demonstrate the method's applicability by studying the evolution of policies during reinforcement learning, evolutionary training, and imitation learning.
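As a simplified stand-in for this idea, a policy can be summarized by a histogram over the states it visits, and policies can be compared by the distance between their descriptors. The histogram descriptor and the total-variation distance below are assumptions of this example; the paper derives its supervectors from adapted Gaussian mixture models.

```python
# Illustrative stand-in: characterize a policy by the states it visits, using a
# state-visitation histogram as the descriptor (an assumption of this example).
import numpy as np

def visitation_descriptor(states, bins=10, low=-1.0, high=1.0):
    """Normalized histogram over visited (1-D) states."""
    hist, _ = np.histogram(states, bins=bins, range=(low, high))
    return hist / max(hist.sum(), 1)

def policy_distance(desc_a, desc_b):
    """Total-variation style distance between two visitation descriptors."""
    return 0.5 * float(np.abs(np.asarray(desc_a) - np.asarray(desc_b)).sum())

# Two policies visiting different parts of the state space look far apart.
a = visitation_descriptor(np.random.default_rng(0).uniform(-1.0, 0.0, 500))
b = visitation_descriptor(np.random.default_rng(1).uniform(0.0, 1.0, 500))
print(policy_distance(a, b))
```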
arXiv Detail & Related papers (2020-12-02T14:43:16Z) - Learning "What-if" Explanations for Sequential Decision-Making [92.8311073739295]
Building interpretable parameterizations of real-world decision-making on the basis of demonstrated behavior is essential.
We propose learning explanations of expert decisions by modeling their reward function in terms of preferences with respect to "what if" outcomes.
We highlight the effectiveness of our batch, counterfactual inverse reinforcement learning approach in recovering accurate and interpretable descriptions of behavior.
arXiv Detail & Related papers (2020-07-02T14:24:17Z) - Continuous Action Reinforcement Learning from a Mixture of Interpretable Experts [35.80418547105711]
We propose a policy scheme that retains a complex function approximator for its internal value predictions but constrains the policy to have a concise, hierarchical, and human-readable structure.
The main technical contribution of the paper is to address the challenges introduced by this non-differentiable state selection procedure.
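A toy illustration of such a structure is a policy that routes each state to one of a few human-readable experts through a hard, non-differentiable selection rule, while value prediction can still use an arbitrarily complex approximator. The experts and the gating rule below are invented for the example, not the paper's learned components.

```python
# Minimal illustration of a policy built from a few human-readable experts with
# a hard (non-differentiable) per-state selection rule. Experts and gating rule
# are placeholders invented for this example.
import numpy as np

EXPERTS = {
    "slow_down": lambda s: np.array([-0.5 * s[1]]),   # brake proportionally to speed
    "hold_lane": lambda s: np.array([0.1 * -s[0]]),   # steer back toward center
}

def select_expert(state):
    """Readable, rule-based gate: which expert handles this state."""
    position, speed = state
    return "slow_down" if speed > 1.0 else "hold_lane"

def mixture_policy(state):
    name = select_expert(state)            # hard selection, not differentiable
    return name, EXPERTS[name](state)

print(mixture_policy(np.array([0.3, 1.5])))   # -> ('slow_down', array([-0.75]))
print(mixture_policy(np.array([0.3, 0.5])))   # -> ('hold_lane', array([-0.03]))
```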
arXiv Detail & Related papers (2020-06-10T16:02:08Z) - Learning Goal-oriented Dialogue Policy with Opposite Agent Awareness [116.804536884437]
We propose an opposite-behavior-aware framework for policy learning in goal-oriented dialogues.
We estimate the opposite agent's policy from its behavior and use this estimate to improve the target agent by incorporating it into the target policy.
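A minimal sketch of this idea is to estimate the other agent's policy from observed action frequencies and let the target policy condition on that estimate. The dialogue states, the action names, and the response rule below are invented for the example and do not come from the paper.

```python
# Illustrative sketch: estimate the opposite agent's policy from its observed
# behavior and condition the target agent's choice on that estimate.
from collections import Counter, defaultdict

class OpponentModel:
    """Empirical estimate of the opposite agent's policy per dialogue state."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, state, action):
        self.counts[state][action] += 1

    def estimate(self, state):
        c = self.counts[state]
        total = sum(c.values()) or 1
        return {a: n / total for a, n in c.items()}

def target_policy(state, opponent_model):
    # Toy rule: if the user mostly asks about price, lead with price information.
    est = opponent_model.estimate(state)
    return "inform_price" if est.get("ask_price", 0.0) > 0.5 else "ask_preference"

m = OpponentModel()
for a in ["ask_price", "ask_price", "ask_location"]:
    m.observe("opening", a)
print(target_policy("opening", m))   # -> 'inform_price'
```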
arXiv Detail & Related papers (2020-04-21T03:13:44Z) - Preventing Imitation Learning with Adversarial Policy Ensembles [79.81807680370677]
Imitation learning can reproduce policies by observing experts, which poses a problem regarding policy privacy.
How can we protect against external observers cloning our proprietary policies?
We introduce a new reinforcement learning framework, where we train an ensemble of near-optimal policies.
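As a toy illustration of the execution-time effect, an actor that samples which near-optimal member acts at each step produces trajectories that mix the members' behaviors and are therefore harder to clone. The member policies and the uniform switching rule below are placeholders, not the paper's adversarially trained ensemble.

```python
# Illustrative sketch: execute an ensemble of policies so that observed
# behavior does not reveal any single member. Members and switching rule are
# toy placeholders.
import random

def policy_a(state):
    return "left" if state % 2 == 0 else "right"

def policy_b(state):
    return "right" if state % 2 == 0 else "left"

class EnsembleActor:
    """Switches between members so trajectories mix their behaviors."""
    def __init__(self, members, seed=0):
        self.members = members
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice(self.members)(state)

actor = EnsembleActor([policy_a, policy_b])
print([actor.act(s) for s in range(6)])   # mixture of both members' behavior
```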
arXiv Detail & Related papers (2020-01-31T01:57:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.