Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning
with Encoder-Decoder Model using Action Query
- URL: http://arxiv.org/abs/2306.13879v1
- Date: Sat, 24 Jun 2023 07:06:14 GMT
- Title: Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning
with Encoder-Decoder Model using Action Query
- Authors: Hidenori Itaya, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu
Fujiyoshi, Komei Sugiura
- Abstract summary: Action Q-Transformer (AQT) introduces a transformer encoder-decoder structure to Q-learning based DRL methods.
We show that visualization of attention in Atari 2600 games enables detailed analysis of agents' decision-making in various game tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The excellent performance of Transformer in supervised learning has led to
growing interest in its potential application to deep reinforcement learning
(DRL) to achieve high performance on a wide variety of problems. However, the
decision making of a DRL agent is a black box, which greatly hinders the
application of the agent to real-world problems. To address this problem, we
propose the Action Q-Transformer (AQT), which introduces a transformer
encoder-decoder structure to Q-learning based DRL methods. In AQT, the encoder
calculates the state value function and the decoder calculates the advantage
function to promote the acquisition of different attentions indicating the
agent's decision-making. The decoder in AQT utilizes action queries, which
represent the information of each action, as queries. This enables us to obtain
the attentions for the state value and for each action. By acquiring and
visualizing these attentions that detail the agent's decision-making, we
achieve a DRL model with high interpretability. In this paper, we show that
visualization of attention in Atari 2600 games enables detailed analysis of
agents' decision-making in various game tasks. Further, experimental results
demonstrate that our method can achieve higher performance than the baseline in
some games.
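From the abstract, the AQT forward pass reads like a dueling architecture built from a transformer: the encoder estimates the state value V(s), the decoder cross-attends learned action queries over the encoded state to estimate per-action advantages A(s, a), and the two are combined into Q-values. The sketch below is one plausible PyTorch rendering of that description, not the authors' implementation; the CNN stem, layer sizes, the mean-subtracted dueling combination, and the omission of positional encodings are all illustrative assumptions.

```python
# Minimal AQT-style sketch (assumptions noted above), not the reference code.
import torch
import torch.nn as nn

class ActionQTransformer(nn.Module):
    def __init__(self, num_actions: int, d_model: int = 128):
        super().__init__()
        # CNN stem turns a 4-frame Atari stack into a sequence of d_model tokens
        # (positional encodings omitted for brevity).
        self.stem = nn.Sequential(
            nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=1), nn.ReLU(),
        )
        enc = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        dec = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=2)
        self.decoder = nn.TransformerDecoder(dec, num_layers=2)
        # One learned query per action; the cross-attention weights of these
        # queries over the encoder tokens are what AQT visualizes per action.
        self.action_queries = nn.Parameter(torch.randn(num_actions, d_model))
        self.value_head = nn.Linear(d_model, 1)      # V(s) from the encoder
        self.advantage_head = nn.Linear(d_model, 1)  # A(s, a) from the decoder

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        b = frames.size(0)
        tokens = self.stem(frames).flatten(2).transpose(1, 2)  # (B, HW, d)
        memory = self.encoder(tokens)
        value = self.value_head(memory.mean(dim=1))            # (B, 1)
        queries = self.action_queries.unsqueeze(0).expand(b, -1, -1)
        decoded = self.decoder(queries, memory)                # (B, A, d)
        advantage = self.advantage_head(decoded).squeeze(-1)   # (B, A)
        # Dueling-style combination of V(s) and A(s, a) into Q(s, a).
        return value + advantage - advantage.mean(dim=1, keepdim=True)

q_values = ActionQTransformer(num_actions=6)(torch.randn(2, 4, 84, 84))  # (2, 6)
```

Under this reading, the decoder's cross-attention map for each action query is the per-action attention the paper visualizes, while the encoder's attention corresponds to the state-value attention.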
Related papers
- An Examination of Offline-Trained Encoders in Vision-Based Deep Reinforcement Learning for Autonomous Driving [0.0]
The research investigates the challenges that Deep Reinforcement Learning (DRL) faces in Partially Observable Markov Decision Processes (POMDPs).
Our research adopts an offline-trained encoder, leveraging large video datasets through self-supervised learning to learn generalizable representations.
We show that the features learned by watching BDD100K driving videos can be directly transferred to achieve lane following and collision avoidance in the CARLA simulator.
arXiv Detail & Related papers (2024-09-02T14:16:23Z) - Q-Transformer: Scalable Offline Reinforcement Learning via
Autoregressive Q-Functions [143.89572689302497]
We present a scalable reinforcement learning method for training multi-task policies from large offline datasets.
Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups.
We show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large diverse real-world robotic manipulation task suite.
arXiv Detail & Related papers (2023-09-18T21:00:38Z) - Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z) - Explaining Deep Reinforcement Learning Agents In The Atari Domain
through a Surrogate Model [78.69367679848632]
We describe a lightweight and effective method to derive explanations for deep RL agents.
Our method relies on a transformation of the pixel-based input of the RL agent to an interpretable, percept-like input representation.
We then train a surrogate model, which is itself interpretable, to replicate the behavior of the target, deep RL agent.
arXiv Detail & Related papers (2021-10-07T05:01:44Z) - Visual Explanation using Attention Mechanism in Actor-Critic-based Deep
Reinforcement Learning [9.49864824780503]
We propose Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into Asynchronous Advantage Actor-Critic (A3C).
A3C consists of a feature extractor that extracts features from an image, a policy branch that outputs the policy, and a value branch that outputs the state value.
We visualized mask-attention maps for games on the Atari 2600 and found that we could easily analyze the reasons behind an agent's decision-making.
arXiv Detail & Related papers (2021-03-06T08:38:12Z) - Hierarchical Variational Autoencoder for Visual Counterfactuals [79.86967775454316]
Conditional Variational Autoencoders (VAEs) are gathering significant attention as an Explainable Artificial Intelligence (XAI) tool.
In this paper we show how relaxing the effect of the posterior leads to successful counterfactuals.
We introduce VAEX, a hierarchical VAE designed for this approach that can visually audit a classifier in applications.
arXiv Detail & Related papers (2021-02-01T14:07:11Z) - Deep Surrogate Q-Learning for Autonomous Driving [17.30342128504405]
We propose Surrogate Q-learning for learning lane-change behavior for autonomous driving.
We show that the architecture leads to a novel replay sampling technique we call Scene-centric Experience Replay.
We also show that our methods enhance real-world applicability of RL systems by learning policies on the real highD dataset.
arXiv Detail & Related papers (2020-10-21T19:49:06Z) - Forgetful Experience Replay in Hierarchical Reinforcement Learning from
Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beat all other entries in the well-known MineRL competition, allowing the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z) - Self-Supervised Discovering of Interpretable Features for Reinforcement
Learning [40.52278913726904]
We propose a self-supervised interpretable framework for deep reinforcement learning.
A self-supervised interpretable network (SSINet) is employed to produce fine-grained attention masks for highlighting task-relevant information.
We verify and evaluate our method on several Atari 2600 games as well as Duckietown, which is a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2020-03-16T08:26:17Z) - Learn to Interpret Atari Agents [106.21468537372995]
Our proposed agent, region-sensitive Rainbow (RS-Rainbow), is an end-to-end trainable network based on the original Rainbow, a powerful deep Q-network agent.
arXiv Detail & Related papers (2018-12-29T03:35:32Z)