Visual Explanation using Attention Mechanism in Actor-Critic-based Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2103.04067v1
- Date: Sat, 6 Mar 2021 08:38:12 GMT
- Title: Visual Explanation using Attention Mechanism in Actor-Critic-based Deep
Reinforcement Learning
- Authors: Hidenori Itaya, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu
Fujiyoshi, Komei Sugiura
- Abstract summary: We propose Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into Asynchronous Advantage Actor-Critic (A3C).
A3C consists of a feature extractor that extracts features from an image, a policy branch that outputs the policy, and a value branch that outputs the state value.
We visualized mask-attention maps for games on the Atari 2600 and found we could easily analyze the reasons behind an agent's decision-making.
- Score: 9.49864824780503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL) has great potential for acquiring the
optimal action in complex environments such as games and robot control.
However, it is difficult to analyze the decision-making of the agent, i.e., the
reasons it selects the actions acquired through learning. In this work, we propose
Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into
Asynchronous Advantage Actor-Critic (A3C), an actor-critic-based DRL
method, and enables analysis of an agent's decision-making in DRL. A3C consists of
a feature extractor that extracts features from an image, a policy branch that
outputs the policy, and a value branch that outputs the state value. In this
method, we focus on the policy and value branches and introduce an attention
mechanism into them. The attention mechanism applies mask processing to the
feature maps of each branch using mask-attention, which expresses the reasoning
behind the policy and state value as a heat map. We visualized
mask-attention maps for games on the Atari 2600 and found we could easily
analyze the reasons behind an agent's decision-making in various game tasks.
Furthermore, experimental results showed that introducing the attention
mechanism enabled the agent to achieve higher performance.
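To make the described architecture concrete, the following PyTorch-style sketch shows one plausible wiring of the shared feature extractor with per-branch mask-attention: each branch computes a sigmoid heat map from the feature map and multiplies it back in before its head. The layer sizes, the 1x1-convolution attention modules, and the class name MaskA3C are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a Mask A3C-style forward pass (assumed details,
# not the authors' code): shared features + per-branch mask-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskA3C(nn.Module):
    def __init__(self, num_actions: int):
        super().__init__()
        # Shared feature extractor: extracts a feature map from the input image.
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # One mask-attention module per branch: a 1x1 conv + sigmoid yields a
        # spatial heat map in [0, 1] that can be visualized directly.
        self.policy_attention = nn.Conv2d(64, 1, kernel_size=1)
        self.value_attention = nn.Conv2d(64, 1, kernel_size=1)
        # Branch heads: policy logits and scalar state value.
        self.policy_head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 7 * 7, num_actions))
        self.value_head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 7 * 7, 1))

    def forward(self, obs: torch.Tensor):
        f = self.features(obs)                             # (B, 64, 7, 7)
        pi_mask = torch.sigmoid(self.policy_attention(f))  # (B, 1, 7, 7) heat map
        v_mask = torch.sigmoid(self.value_attention(f))    # (B, 1, 7, 7) heat map
        # Mask processing: element-wise weighting of each branch's feature map.
        policy = F.softmax(self.policy_head(f * pi_mask), dim=-1)
        value = self.value_head(f * v_mask)
        return policy, value, pi_mask, v_mask


# Usage: one Atari-style observation of 4 stacked 84x84 frames (assumed preprocessing).
policy, value, pi_mask, v_mask = MaskA3C(num_actions=6)(torch.zeros(1, 4, 84, 84))
```

For visualization, one common approach is to upsample the low-resolution mask (here 7x7) to the input frame size and overlay it on the game screen as a heat map.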
Related papers
- Why the Agent Made that Decision: Explaining Deep Reinforcement Learning with Vision Masks [11.068220265247385]
VisionMask is a standalone explanation model trained end-to-end to identify the most critical regions in the agent's visual input that can explain its actions.
It achieves a 14.9% higher insertion accuracy and a 30.08% higher F1-Score in reproducing original actions from selected visual explanations.
arXiv Detail & Related papers (2024-11-25T06:11:46Z)
- DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction [4.813546138483559]
Reinforcement Learning (RL) algorithms can learn robotic control tasks from visual observations, but they often require a large amount of data.
In this paper, we explore how the agent's knowledge of its shape can improve the sample efficiency of visual RL methods.
We propose a novel method, Disentangled Environment and Agent Representations, that uses the segmentation mask of the agent as supervision.
arXiv Detail & Related papers (2024-06-30T09:15:21Z)
- Agent Attention: On the Integration of Softmax and Linear Attention [70.06472039237354]
We propose a novel attention paradigm, Agent Attention, to strike a favorable balance between computational efficiency and representation power.
We show that the proposed agent attention is equivalent to a generalized form of linear attention.
Notably, agent attention shows remarkable performance in high-resolution scenarios, owing to its linear attention nature.
arXiv Detail & Related papers (2023-12-14T16:26:29Z)
- Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective [19.744322603358402]
We propose a novel Advantage Actor-Critic with Reasoner (A2CR).
A2CR automatically generates a more comprehensive and interpretable paradigm for understanding the agent's decision-making process.
It offers a range of functionalities such as purpose-based saliency, early failure detection, and model supervision.
arXiv Detail & Related papers (2023-09-09T07:19:20Z)
- Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning with Encoder-Decoder Model using Action Query [7.290230029542328]
Action Q-Transformer (AQT) introduces a transformer encoder-decoder structure to Q-learning-based DRL methods.
We show that visualization of attention in Atari 2600 games enables detailed analysis of agents' decision-making in various game tasks.
arXiv Detail & Related papers (2023-06-24T07:06:14Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
- Automated Machine Learning, Bounded Rationality, and Rational Metareasoning [62.997667081978825]
We will look at automated machine learning (AutoML) and related problems from the perspective of bounded rationality.
Taking actions under bounded resources requires an agent to reflect on how to use these resources in an optimal way.
arXiv Detail & Related papers (2021-09-10T09:10:20Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all other solutions in the well-known MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
- Self-Supervised Discovering of Interpretable Features for Reinforcement Learning [40.52278913726904]
We propose a self-supervised interpretable framework for deep reinforcement learning.
A self-supervised interpretable network (SSINet) is employed to produce fine-grained attention masks for highlighting task-relevant information.
We verify and evaluate our method on several Atari 2600 games as well as Duckietown, which is a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2020-03-16T08:26:17Z)
- Learn to Interpret Atari Agents [106.21468537372995]
Region-sensitive Rainbow (RS-Rainbow) is an end-to-end trainable network based on the original Rainbow, a powerful deep Q-network agent.
arXiv Detail & Related papers (2018-12-29T03:35:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.