Visual Explanation using Attention Mechanism in Actor-Critic-based Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2103.04067v1
- Date: Sat, 6 Mar 2021 08:38:12 GMT
- Title: Visual Explanation using Attention Mechanism in Actor-Critic-based Deep
Reinforcement Learning
- Authors: Hidenori Itaya, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu
Fujiyoshi, Komei Sugiura
- Abstract summary: We propose Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into Asynchronous Advantage Actor-Critic (A3C).
A3C consists of a feature extractor that extracts features from an image, a policy branch that outputs the policy, and a value branch that outputs the state value.
We visualized mask-attention maps for games on the Atari 2600 and found we could easily analyze the reasons behind an agent's decision-making.
- Score: 9.49864824780503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL) has great potential for acquiring the
optimal action in complex environments such as games and robot control.
However, it is difficult to analyze the decision-making of the agent, i.e., the
reasons it selects the actions acquired through learning. In this work, we propose
Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into
Asynchronous Advantage Actor-Critic (A3C), an actor-critic-based DRL
method, and enables analysis of an agent's decision-making in DRL. A3C consists of
a feature extractor that extracts features from an image, a policy branch that
outputs the policy, and a value branch that outputs the state value. In this
method, we focus on the policy and value branches and introduce an attention
mechanism into them. The attention mechanism applies mask processing to the
feature maps of each branch using mask-attention, which expresses the reasoning
behind the policy and state value as a heat map. We visualized
mask-attention maps for games on the Atari 2600 and found we could easily
analyze the reasons behind an agent's decision-making in various game tasks.
Furthermore, experimental results showed that introducing the attention
mechanism enabled the agent to achieve higher performance.
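To make the described architecture concrete, the following PyTorch-style sketch shows one plausible wiring of the shared feature extractor with per-branch mask-attention: each branch computes a sigmoid heat map from the feature map and multiplies it back in before its head. The layer sizes, the 1x1-convolution attention modules, and the class name MaskA3C are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a Mask A3C-style forward pass (assumed details,
# not the authors' code): shared features + per-branch mask-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskA3C(nn.Module):
    def __init__(self, num_actions: int):
        super().__init__()
        # Shared feature extractor: extracts a feature map from the input image.
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # One mask-attention module per branch: a 1x1 conv + sigmoid yields a
        # spatial heat map in [0, 1] that can be visualized directly.
        self.policy_attention = nn.Conv2d(64, 1, kernel_size=1)
        self.value_attention = nn.Conv2d(64, 1, kernel_size=1)
        # Branch heads: policy logits and scalar state value.
        self.policy_head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 7 * 7, num_actions))
        self.value_head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 7 * 7, 1))

    def forward(self, obs: torch.Tensor):
        f = self.features(obs)                             # (B, 64, 7, 7)
        pi_mask = torch.sigmoid(self.policy_attention(f))  # (B, 1, 7, 7) heat map
        v_mask = torch.sigmoid(self.value_attention(f))    # (B, 1, 7, 7) heat map
        # Mask processing: element-wise weighting of each branch's feature map.
        policy = F.softmax(self.policy_head(f * pi_mask), dim=-1)
        value = self.value_head(f * v_mask)
        return policy, value, pi_mask, v_mask


# Usage: one Atari-style observation of 4 stacked 84x84 frames (assumed preprocessing).
policy, value, pi_mask, v_mask = MaskA3C(num_actions=6)(torch.zeros(1, 4, 84, 84))
```

For visualization, one common approach is to upsample the low-resolution mask (here 7x7) to the input frame size and overlay it on the game screen as a heat map.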
Related papers
- Why the Agent Made that Decision: Explaining Deep Reinforcement Learning with Vision Masks [11.068220265247385]
VisionMask is a standalone explanation model trained end-to-end to identify the most critical regions in the agent's visual input that can explain its actions.
It achieves a 14.9% higher insertion accuracy and a 30.08% higher F1-Score in reproducing original actions from selected visual explanations.
arXiv Detail & Related papers (2024-11-25T06:11:46Z)
- DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction [4.813546138483559]
Reinforcement Learning (RL) algorithms can learn robotic control tasks from visual observations, but they often require a large amount of data.
In this paper, we explore how the agent's knowledge of its shape can improve the sample efficiency of visual RL methods.
We propose a novel method, Disentangled Environment and Agent Representations, that uses the segmentation mask of the agent as supervision.
arXiv Detail & Related papers (2024-06-30T09:15:21Z)
- Agent Attention: On the Integration of Softmax and Linear Attention [70.06472039237354]
We propose a novel attention paradigm, Agent Attention, to strike a favorable balance between computational efficiency and representation power.
We show that the proposed agent attention is equivalent to a generalized form of linear attention.
Notably, agent attention shows remarkable performance in high-resolution scenarios, owing to its linear attention nature.
arXiv Detail & Related papers (2023-12-14T16:26:29Z)
- Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective [19.744322603358402]
We propose a novel Advantage Actor-Critic with Reasoner (A2CR).
A2CR automatically generates a more comprehensive and interpretable paradigm for understanding the agent's decision-making process.
It offers a range of functionalities such as purpose-based saliency, early failure detection, and model supervision.
arXiv Detail & Related papers (2023-09-09T07:19:20Z)
- Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning with Encoder-Decoder Model using Action Query [7.290230029542328]
Action Q-Transformer (AQT) introduces a transformer encoder-decoder structure to Q-learning-based DRL methods.
We show that visualization of attention in Atari 2600 games enables detailed analysis of agents' decision-making in various game tasks.
arXiv Detail & Related papers (2023-06-24T07:06:14Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
- Automated Machine Learning, Bounded Rationality, and Rational Metareasoning [62.997667081978825]
We will look at automated machine learning (AutoML) and related problems from the perspective of bounded rationality.
Taking actions under bounded resources requires an agent to reflect on how to use these resources in an optimal way.
arXiv Detail & Related papers (2021-09-10T09:10:20Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all other solutions in the well-known MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
- Self-Supervised Discovering of Interpretable Features for Reinforcement Learning [40.52278913726904]
We propose a self-supervised interpretable framework for deep reinforcement learning.
A self-supervised interpretable network (SSINet) is employed to produce fine-grained attention masks for highlighting task-relevant information.
We verify and evaluate our method on several Atari 2600 games as well as Duckietown, which is a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2020-03-16T08:26:17Z)
- Learn to Interpret Atari Agents [106.21468537372995]
Region-sensitive Rainbow (RS-Rainbow) is an end-to-end trainable network based on the original Rainbow, a powerful deep Q-network agent.
arXiv Detail & Related papers (2018-12-29T03:35:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.