Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning
with Encoder-Decoder Model using Action Query
- URL: http://arxiv.org/abs/2306.13879v1
- Date: Sat, 24 Jun 2023 07:06:14 GMT
- Title: Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning
with Encoder-Decoder Model using Action Query
- Authors: Hidenori Itaya, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu
Fujiyoshi, Komei Sugiura
- Abstract summary: Action Q-Transformer (AQT) introduces a transformer encoder-decoder structure to Q-learning based DRL methods.
We show that visualization of attention in Atari 2600 games enables detailed analysis of agents' decision-making in various game tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The excellent performance of Transformer in supervised learning has led to
growing interest in its potential application to deep reinforcement learning
(DRL) to achieve high performance on a wide variety of problems. However, the
decision making of a DRL agent is a black box, which greatly hinders the
application of the agent to real-world problems. To address this problem, we
propose the Action Q-Transformer (AQT), which introduces a transformer
encoder-decoder structure to Q-learning based DRL methods. In AQT, the encoder
calculates the state value function and the decoder calculates the advantage
function to promote the acquisition of different attentions indicating the
agent's decision-making. The decoder in AQT utilizes action queries, which
represent the information of each action, as queries. This enables us to obtain
the attentions for the state value and for each action. By acquiring and
visualizing these attentions that detail the agent's decision-making, we
achieve a DRL model with high interpretability. In this paper, we show that
visualization of attention in Atari 2600 games enables detailed analysis of
agents' decision-making in various game tasks. Further, experimental results
demonstrate that our method can achieve higher performance than the baseline in
some games.
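From the abstract, the AQT forward pass reads like a dueling architecture built from a transformer: the encoder estimates the state value V(s), the decoder cross-attends learned action queries over the encoded state to estimate per-action advantages A(s, a), and the two are combined into Q-values. The sketch below is one plausible PyTorch rendering of that description, not the authors' implementation; the CNN stem, layer sizes, the mean-subtracted dueling combination, and the omission of positional encodings are all illustrative assumptions.

```python
# Minimal AQT-style sketch (assumptions noted above), not the reference code.
import torch
import torch.nn as nn

class ActionQTransformer(nn.Module):
    def __init__(self, num_actions: int, d_model: int = 128):
        super().__init__()
        # CNN stem turns a 4-frame Atari stack into a sequence of d_model tokens
        # (positional encodings omitted for brevity).
        self.stem = nn.Sequential(
            nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=1), nn.ReLU(),
        )
        enc = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        dec = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=2)
        self.decoder = nn.TransformerDecoder(dec, num_layers=2)
        # One learned query per action; the cross-attention weights of these
        # queries over the encoder tokens are what AQT visualizes per action.
        self.action_queries = nn.Parameter(torch.randn(num_actions, d_model))
        self.value_head = nn.Linear(d_model, 1)      # V(s) from the encoder
        self.advantage_head = nn.Linear(d_model, 1)  # A(s, a) from the decoder

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        b = frames.size(0)
        tokens = self.stem(frames).flatten(2).transpose(1, 2)  # (B, HW, d)
        memory = self.encoder(tokens)
        value = self.value_head(memory.mean(dim=1))            # (B, 1)
        queries = self.action_queries.unsqueeze(0).expand(b, -1, -1)
        decoded = self.decoder(queries, memory)                # (B, A, d)
        advantage = self.advantage_head(decoded).squeeze(-1)   # (B, A)
        # Dueling-style combination of V(s) and A(s, a) into Q(s, a).
        return value + advantage - advantage.mean(dim=1, keepdim=True)

q_values = ActionQTransformer(num_actions=6)(torch.randn(2, 4, 84, 84))  # (2, 6)
```

Under this reading, the decoder's cross-attention map for each action query is the per-action attention the paper visualizes, while the encoder's attention corresponds to the state-value attention.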
Related papers
- An Examination of Offline-Trained Encoders in Vision-Based Deep Reinforcement Learning for Autonomous Driving [0.0]
The research investigates the challenges that Deep Reinforcement Learning (DRL) faces in Partially Observable Markov Decision Processes (POMDPs).
Our research adopts an offline-trained encoder, leveraging large video datasets through self-supervised learning to learn generalizable representations.
We show that the features learned by watching BDD100K driving videos can be directly transferred to achieve lane following and collision avoidance in the CARLA simulator.
arXiv Detail & Related papers (2024-09-02T14:16:23Z) - Q-Transformer: Scalable Offline Reinforcement Learning via
Autoregressive Q-Functions [143.89572689302497]
We present a scalable reinforcement learning method for training multi-task policies from large offline datasets.
Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups.
We show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large diverse real-world robotic manipulation task suite.
arXiv Detail & Related papers (2023-09-18T21:00:38Z) - Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z) - Explaining Deep Reinforcement Learning Agents In The Atari Domain
through a Surrogate Model [78.69367679848632]
We describe a lightweight and effective method to derive explanations for deep RL agents.
Our method relies on a transformation of the pixel-based input of the RL agent to an interpretable, percept-like input representation.
We then train a surrogate model, which is itself interpretable, to replicate the behavior of the target, deep RL agent.
arXiv Detail & Related papers (2021-10-07T05:01:44Z) - Visual Explanation using Attention Mechanism in Actor-Critic-based Deep
Reinforcement Learning [9.49864824780503]
We propose Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into Asynchronous Advantage Actor-Critic (A3C).
A3C consists of a feature extractor that extracts features from an image, a policy branch that outputs the policy, and a value branch that outputs the state value.
We visualized mask-attention maps for games on the Atari 2600 and found that we could easily analyze the reasons behind an agent's decision-making.
arXiv Detail & Related papers (2021-03-06T08:38:12Z) - Hierarchical Variational Autoencoder for Visual Counterfactuals [79.86967775454316]
Conditional Variational Autoencoders (VAEs) are gathering significant attention as an Explainable Artificial Intelligence (XAI) tool.
In this paper we show how relaxing the effect of the posterior leads to successful counterfactuals.
We introduce VAEX, a hierarchical VAE designed for this approach that can visually audit a classifier in applications.
arXiv Detail & Related papers (2021-02-01T14:07:11Z) - Deep Surrogate Q-Learning for Autonomous Driving [17.30342128504405]
We propose Surrogate Q-learning for learning lane-change behavior for autonomous driving.
We show that the architecture leads to a novel replay sampling technique we call Scene-centric Experience Replay.
We also show that our methods enhance real-world applicability of RL systems by learning policies on the real highD dataset.
arXiv Detail & Related papers (2020-10-21T19:49:06Z) - Forgetful Experience Replay in Hierarchical Reinforcement Learning from
Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beat all other entries in the well-known MineRL competition, allowing the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z) - Self-Supervised Discovering of Interpretable Features for Reinforcement
Learning [40.52278913726904]
We propose a self-supervised interpretable framework for deep reinforcement learning.
A self-supervised interpretable network (SSINet) is employed to produce fine-grained attention masks for highlighting task-relevant information.
We verify and evaluate our method on several Atari 2600 games as well as Duckietown, which is a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2020-03-16T08:26:17Z) - Learn to Interpret Atari Agents [106.21468537372995]
Our proposed agent, region-sensitive Rainbow (RS-Rainbow), is an end-to-end trainable network based on the original Rainbow, a powerful deep Q-network agent.
arXiv Detail & Related papers (2018-12-29T03:35:32Z)