Explaining RL Decisions with Trajectories
- URL: http://arxiv.org/abs/2305.04073v2
- Date: Mon, 22 Jan 2024 12:00:58 GMT
- Title: Explaining RL Decisions with Trajectories
- Authors: Shripad Vilasrao Deshmukh, Arpan Dasgupta, Balaji Krishnamurthy, Nan Jiang, Chirag Agarwal, Georgios Theocharous, Jayakumar Subramanian
- Abstract summary: Explanation is a key component for the adoption of reinforcement learning (RL) in many real-world decision-making problems.
We propose a complementary approach to these explanations, particularly for offline RL, where we attribute the policy decisions of a trained RL agent to the trajectories encountered by it during training.
- Score: 28.261758841898697
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Explanation is a key component for the adoption of reinforcement learning
(RL) in many real-world decision-making problems. In the literature, the
explanation is often provided by saliency attribution to the features of the RL
agent's state. In this work, we propose a complementary approach to these
explanations, particularly for offline RL, where we attribute the policy
decisions of a trained RL agent to the trajectories encountered by it during
training. To do so, we encode trajectories in offline training data
individually as well as collectively (encoding a set of trajectories). We then
attribute policy decisions to a set of trajectories in this encoded space by
estimating the sensitivity of the decision with respect to that set. Further,
we demonstrate the effectiveness of the proposed approach in terms of quality
of attributions as well as practical scalability in diverse environments that
involve both discrete and continuous state and action spaces such as
grid-worlds, video games (Atari) and continuous control (MuJoCo). We also
conduct a human study on a simple navigation task to observe how participants'
understanding of the task compares with the data attributed for a trained RL
policy.
Keywords: Explainable AI, Verifiability of AI Decisions, Explainable RL.
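The attribution idea in the abstract (estimating how sensitive a decision is to a set of training trajectories) can be illustrated with a leave-one-cluster-out check. The following is a minimal sketch and not the authors' implementation: the toy `train_policy` learner, the hand-assigned `clusters`, and the example data are all hypothetical, and the trajectory encoding and clustering steps described in the abstract are omitted and replaced by a given grouping.

```python
from collections import defaultdict

# A trajectory is a list of (state, action, reward) tuples.
# Everything below is an illustrative assumption, not the paper's code.


def train_policy(trajectories):
    """Toy offline learner: per state, pick the action with the highest
    mean return-to-go observed in the data (stand-in for a real offline RL algorithm)."""
    totals = defaultdict(lambda: [0.0, 0])  # (state, action) -> [sum of returns, count]
    for traj in trajectories:
        ret = 0.0
        for state, action, reward in reversed(traj):
            ret += reward  # undiscounted return-to-go from this step
            entry = totals[(state, action)]
            entry[0] += ret
            entry[1] += 1
    best = {}
    for (state, action), (total, count) in totals.items():
        mean = total / count
        if state not in best or mean > best[state][1]:
            best[state] = (action, mean)
    return {state: action for state, (action, _) in best.items()}


def attribute_decision(trajectories, clusters, query_state):
    """Return the original action at query_state and the ids of clusters
    whose removal from the training data changes that action."""
    original = train_policy(trajectories).get(query_state)
    responsible = []
    for cluster_id, member_indices in clusters.items():
        kept = [t for i, t in enumerate(trajectories) if i not in member_indices]
        retrained_action = train_policy(kept).get(query_state)
        if retrained_action != original:
            responsible.append(cluster_id)
    return original, responsible


# Hypothetical offline dataset: two trajectories reach a reward by going right,
# two collect a smaller reward by going left.
trajectories = [
    [("s0", "right", 0.0), ("s1", "right", 1.0)],
    [("s0", "right", 0.0), ("s1", "right", 1.0)],
    [("s0", "left", 0.5)],
    [("s0", "left", 0.4)],
]
# Hand-assigned grouping; the paper instead groups trajectories in an encoded space.
clusters = {"goes_right": {0, 1}, "goes_left": {2, 3}}

action, responsible = attribute_decision(trajectories, clusters, "s0")
print(action, responsible)  # right ['goes_right']
```

In this toy setting the choice of `right` at `s0` is attributed to the `goes_right` cluster, because retraining without it flips the action, while removing `goes_left` leaves the decision unchanged.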
Related papers
- ODRL: A Benchmark for Off-Dynamics Reinforcement Learning [59.72217833812439]
We introduce ODRL, the first benchmark tailored for evaluating off-dynamics RL methods.
ODRL contains four experimental settings where the source and target domains can be either online or offline.
We conduct extensive benchmarking experiments, which show that no method has universal advantages across varied dynamics shifts.
arXiv Detail & Related papers (2024-10-28T05:29:38Z)
- Diffusion-Based Offline RL for Improved Decision-Making in Augmented ARC Task [10.046325073900297]
We introduce an augmented offline RL dataset for Abstraction and Reasoning (SOLAR).
SOLAR enables the application of offline RL methods by offering sufficient experience data.
Our experiments demonstrate the effectiveness of the offline RL approach on a simple ARC task.
arXiv Detail & Related papers (2024-10-15T06:48:27Z)
- Offline Reinforcement Learning from Datasets with Structured Non-Stationarity [50.35634234137108]
Current Reinforcement Learning (RL) is often limited by the large amount of data needed to learn a successful policy.
We address a novel Offline RL problem setting in which, while collecting the dataset, the transition and reward functions gradually change between episodes but stay constant within each episode.
We propose a method based on Contrastive Predictive Coding that identifies this non-stationarity in the offline dataset, accounts for it when training a policy, and predicts it during evaluation.
arXiv Detail & Related papers (2024-05-23T02:41:36Z)
- Demystifying the Physics of Deep Reinforcement Learning-Based Autonomous Vehicle Decision-Making [6.243971093896272]
We use a continuous proximal policy optimization-based DRL algorithm as the baseline model and add a multi-head attention framework in an open-source AV simulation environment.
We show that the weights in the first head encode the positions of the neighboring vehicles while the second head focuses on the leader vehicle exclusively.
arXiv Detail & Related papers (2024-03-18T02:59:13Z)
- Solving Offline Reinforcement Learning with Decision Tree Regression [0.0]
This study presents a novel approach to addressing offline reinforcement learning problems by reframing them as regression tasks.
We introduce two distinct frameworks: return-conditioned and return-weighted decision tree policies.
Despite the simplification inherent in this reformulated approach to offline RL, our agents demonstrate performance that is at least on par with the established methods.
arXiv Detail & Related papers (2024-01-21T23:50:46Z)
- Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage [32.578787778183546]
Offline reinforcement learning (RL) algorithms learn optimal policies using historical (offline) data.
One of the main challenges in offline RL is the distribution shift.
We propose two offline RL algorithms using the distributionally robust learning (DRL) framework.
arXiv Detail & Related papers (2023-10-27T19:19:30Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning [69.19840497497503]
It is argued that the commonly used action matching principle is more like an explanation of deep neural networks (DNNs) than the interpretation of RL agents.
We propose to consider rewards, the essential objective of RL agents, as the essential objective of interpreting RL agents.
We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2023-09-04T09:09:54Z)
- Explainable Reinforcement Learning for Broad-XAI: A Conceptual Framework and Survey [0.7366405857677226]
Reinforcement Learning (RL) methods provide a potential backbone for the cognitive model required for the development of Broad-XAI.
RL represents a suite of approaches that have had increasing success in solving a range of sequential decision-making problems.
This paper aims to introduce a conceptual framework, called the Causal XRL Framework (CXF), that unifies the current XRL research and uses RL as a backbone to the development of Broad-XAI.
arXiv Detail & Related papers (2021-08-20T05:18:50Z)
- EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL [48.552287941528]
Off-policy reinforcement learning holds the promise of sample-efficient learning of decision-making policies.
In the offline RL setting, standard off-policy RL methods can significantly underperform.
We introduce Expected-Max Q-Learning (EMaQ), which is more closely related to the resulting practical algorithm.
arXiv Detail & Related papers (2020-07-21T21:13:02Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.