Self-Supervised Discovering of Interpretable Features for Reinforcement
Learning
- URL: http://arxiv.org/abs/2003.07069v4
- Date: Fri, 19 Mar 2021 08:01:48 GMT
- Title: Self-Supervised Discovering of Interpretable Features for Reinforcement
Learning
- Authors: Wenjie Shi and Gao Huang and Shiji Song and Zhuoyuan Wang and Tingyu
Lin and Cheng Wu
- Abstract summary: We propose a self-supervised interpretable framework for deep reinforcement learning.
A self-supervised interpretable network (SSINet) is employed to produce fine-grained attention masks for highlighting task-relevant information.
We verify and evaluate our method on several Atari 2600 games as well as Duckietown, which is a challenging self-driving car simulator environment.
- Score: 40.52278913726904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (RL) has recently led to many breakthroughs on a
range of complex control tasks. However, the agent's decision-making process is
generally not transparent. The lack of interpretability hinders the
applicability of RL in safety-critical scenarios. While several methods have
attempted to interpret vision-based RL, most offer no detailed explanation of
the agent's behavior. In this paper, we propose a self-supervised
interpretable framework, which can discover interpretable features to enable
easy understanding of RL agents even for non-experts. Specifically, a
self-supervised interpretable network (SSINet) is employed to produce
fine-grained attention masks for highlighting task-relevant information, which
constitutes most evidence for the agent's decisions. We verify and evaluate our
method on several Atari 2600 games as well as Duckietown, which is a
challenging self-driving car simulator environment. The results show that our
method provides empirical evidence about how the agent makes decisions and why
the agent performs well or badly, especially when transferred to novel scenes.
Overall, our method provides valuable insight into the internal decision-making
process of vision-based RL. In addition, our method does not use any external
labelled data, and thus demonstrates the possibility of learning high-quality
masks in a self-supervised manner, which may shed light on new paradigms for
label-free vision learning such as self-supervised segmentation and detection.
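To make the self-supervised training signal concrete, below is a minimal PyTorch-style sketch of the mask-learning idea the abstract describes: an encoder-decoder mask network is trained so that the masked observation alone lets a frozen, pretrained policy reproduce its own actions, with a sparsity term keeping the mask focused on task-relevant pixels. The architecture, module names, and loss weighting here are illustrative assumptions, not the paper's exact SSINet design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskNet(nn.Module):
    """Illustrative encoder-decoder producing a per-pixel mask in [0, 1]
    (the architecture is an assumption, not the paper's exact SSINet)."""
    def __init__(self, in_channels: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.decoder(self.encoder(obs)))

def self_supervised_loss(mask_net, frozen_policy, obs, sparsity_coef=0.01):
    """Behaviour matching + sparsity: the masked observation must let the
    frozen agent reproduce its own action, using no external labels."""
    mask = mask_net(obs)                        # (B, 1, H, W), values in [0, 1]
    with torch.no_grad():
        # The pretrained agent's own decision serves as the training target.
        target_action = frozen_policy(obs).argmax(dim=-1)
    masked_logits = frozen_policy(mask * obs)   # decision from masked input only
    behaviour = F.cross_entropy(masked_logits, target_action)
    sparsity = mask.mean()                      # keep the mask small and focused
    return behaviour + sparsity_coef * sparsity
```

Here `frozen_policy` stands in for any pretrained actor whose action logits are accessible; the mask that survives training is then visualised as the attention map over the observation.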
Related papers
- Why the Agent Made that Decision: Explaining Deep Reinforcement Learning with Vision Masks [11.068220265247385]
VisionMask is a standalone explanation model trained end-to-end to identify the most critical regions in the agent's visual input that can explain its actions.
It achieves a 14.9% higher insertion accuracy and a 30.08% higher F1-Score in reproducing original actions from selected visual explanations.
arXiv Detail & Related papers (2024-11-25T06:11:46Z)
- DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction [4.813546138483559]
Reinforcement Learning (RL) algorithms can learn robotic control tasks from visual observations, but they often require a large amount of data.
In this paper, we explore how the agent's knowledge of its shape can improve the sample efficiency of visual RL methods.
We propose a novel method, Disentangled Environment and Agent Representations, that uses the segmentation mask of the agent as supervision.
arXiv Detail & Related papers (2024-06-30T09:15:21Z)
- Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior
from an Exploratory Perspective [19.744322603358402]
We propose a novel Advantage Actor-Critic with Reasoner (A2CR).
A2CR automatically generates a more comprehensive and interpretable paradigm for understanding the agent's decision-making process.
It offers a range of functionalities such as purpose-based saliency, early failure detection, and model supervision.
arXiv Detail & Related papers (2023-09-09T07:19:20Z)
- Leveraging Reward Consistency for Interpretable Feature Discovery in
Reinforcement Learning [69.19840497497503]
It is argued that the commonly used action-matching principle yields an explanation of deep neural networks (DNNs) rather than an interpretation of RL agents.
We propose instead to take rewards, the essential objective of RL agents, as the basis for interpreting RL agents.
We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2023-09-04T09:09:54Z)
- GANterfactual-RL: Understanding Reinforcement Learning Agents'
Strategies through Visual Counterfactual Explanations [0.7874708385247353]
We propose a novel but simple method to generate counterfactual explanations for RL agents.
Our method is fully model-agnostic, and we demonstrate that it outperforms the only previous method on several computational metrics.
arXiv Detail & Related papers (2023-02-24T15:29:43Z)
- A Survey on Explainable Reinforcement Learning: Concepts, Algorithms,
Challenges [38.70863329476517]
Reinforcement Learning (RL) is a popular machine learning paradigm where intelligent agents interact with the environment to fulfill a long-term goal.
Despite the encouraging results achieved, the deep neural network-based backbone is widely deemed a black box that impedes practitioners from trusting and employing trained agents in realistic scenarios where high security and reliability are essential.
To alleviate this issue, a large body of literature has been devoted to shedding light on the inner workings of intelligent agents, either by constructing intrinsic interpretability or by providing post-hoc explainability.
arXiv Detail & Related papers (2022-11-12T13:52:06Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Explaining Deep Reinforcement Learning Agents In The Atari Domain
through a Surrogate Model [78.69367679848632]
We describe a lightweight and effective method to derive explanations for deep RL agents.
Our method relies on a transformation of the pixel-based input of the RL agent to an interpretable, percept-like input representation.
We then train a surrogate model, which is itself interpretable, to replicate the behavior of the target deep RL agent (a minimal sketch of this surrogate recipe appears after this list).
arXiv Detail & Related papers (2021-10-07T05:01:44Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works aimed at attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
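As a companion to the surrogate-model entry above, here is a minimal, hypothetical sketch of the general recipe it describes: convert the agent's pixel input into a small percept-like feature vector, record the deep agent's actions on those inputs, and fit an interpretable surrogate (a decision tree here) to mimic them. The feature extractor, names, and tree depth are illustrative assumptions, not the cited paper's actual representation or model.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def percept_features(frame: np.ndarray) -> np.ndarray:
    """Hypothetical percept-like representation: an 8x8 grid of mean
    intensities instead of raw pixels (a stand-in, not the paper's)."""
    h, w = frame.shape[:2]
    grid = frame[: h // 8 * 8, : w // 8 * 8].reshape(8, h // 8, 8, w // 8, -1)
    return grid.mean(axis=(1, 3)).ravel()  # one summary value per cell/channel

def fit_surrogate(frames, agent_policy):
    """Fit an interpretable tree to replicate the deep agent's actions.
    `agent_policy` is assumed to map a frame to its chosen action id."""
    X = np.stack([percept_features(f) for f in frames])
    y = np.array([agent_policy(f) for f in frames])  # the agent's own actions
    tree = DecisionTreeClassifier(max_depth=4).fit(X, y)
    print(export_text(tree))  # human-readable rules approximating the agent
    return tree
```

The tree's fidelity to the agent (its accuracy on held-out feature/action pairs) indicates how far the printed rules can be trusted as an explanation of the deep policy.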