Explaining Reinforcement Learning with Shapley Values
- URL: http://arxiv.org/abs/2306.05810v1
- Date: Fri, 9 Jun 2023 10:52:39 GMT
- Title: Explaining Reinforcement Learning with Shapley Values
- Authors: Daniel Beechey, Thomas M. S. Smith, Özgür Şimşek
- Abstract summary: We present a theoretical analysis of explaining reinforcement learning using Shapley values.
Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning.
We then develop an approach that uses Shapley values to explain agent performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For reinforcement learning systems to be widely adopted, their users must
understand and trust them. We present a theoretical analysis of explaining
reinforcement learning using Shapley values, following a principled approach
from game theory for identifying the contribution of individual players to the
outcome of a cooperative game. We call this general framework Shapley Values
for Explaining Reinforcement Learning (SVERL). Our analysis exposes the
limitations of earlier uses of Shapley values in reinforcement learning. We
then develop an approach that uses Shapley values to explain agent performance.
In a variety of domains, SVERL produces meaningful explanations that match and
supplement human intuition.
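For context, the Shapley value of player i in a cooperative game with player set N of size n and characteristic function v is

  \phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \left( v(S \cup \{i\}) - v(S) \right).

The sketch below illustrates the general idea behind this use of Shapley values, not the authors' implementation: state features act as players, and a coalition is scored by the agent's estimated expected return when only those features are observed. The value function, how unobserved features are marginalised, and all names are assumptions for illustration.

from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values over a small set of state features.

    features: list of feature indices (the "players").
    value:    maps a frozenset of observed features to the agent's
              estimated expected return under that observation
              (e.g., via rollouts with other features marginalised).
    """
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[f] += w * (value(s | {f}) - value(s))
    return phi

Exact enumeration costs 2^n evaluations of value, so in practice the expectation is approximated by sampling coalitions.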
Related papers
- Improve Vision Language Model Chain-of-thought Reasoning
Chain-of-thought (CoT) reasoning in vision language models (VLMs) is crucial for improving interpretability and trustworthiness.
We show that training VLM on short answers does not generalize well to reasoning tasks that require more detailed responses.
arXiv Detail & Related papers (2024-10-21T17:00:06Z)
- Improving the Sampling Strategy in KernelSHAP
The KernelSHAP framework approximates Shapley values using a sampled subset of weighted conditional expectations.
We propose three novel contributions: a stabilizing technique that reduces the variance of the weights in the current state-of-the-art strategy, a novel weighting scheme that corrects the Shapley kernel weights based on the sampled subsets, and a straightforward strategy that includes the important subsets and integrates them with the corrected Shapley kernel weights.
arXiv Detail & Related papers (2024-10-07T10:02:31Z)
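For reference, KernelSHAP weights a coalition of size k out of M features by the Shapley kernel; the snippet below computes these standard weights and does not implement the paper's corrected sampling scheme (the helper name is illustrative).

from math import comb

def shapley_kernel_weight(M, k):
    """Shapley kernel weight for a coalition of size k among M features,
    valid for 0 < k < M (the empty and full coalitions get infinite
    weight and are handled as constraints instead)."""
    return (M - 1) / (comb(M, k) * k * (M - k))

# The weights concentrate on very small and very large coalitions, which
# is one source of the sampling variance that the contributions above
# aim to reduce.
weights = {k: shapley_kernel_weight(8, k) for k in range(1, 8)}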
- Shapley Pruning for Neural Network Compression
This work presents Shapley value approximations and performs a comparative cost-benefit analysis for neural network compression.
The proposed normative ranking and its approximations show practical results, achieving state-of-the-art network compression.
arXiv Detail & Related papers (2024-07-19T11:42:54Z)
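One common way to approximate such rankings, shown here as a hedged sketch rather than the paper's algorithm, is permutation-based Monte Carlo estimation of each unit's Shapley value: average the marginal change in validation accuracy as units are added in random orders, then prune the lowest-ranked units. The evaluate helper is an assumption.

import random

def shapley_channel_scores(channels, evaluate, n_perms=50):
    """Monte Carlo Shapley estimates for pruning candidates.

    channels: list of prunable unit identifiers.
    evaluate: assumed helper returning validation accuracy when only
              the given set of units is enabled.
    """
    scores = {c: 0.0 for c in channels}
    for _ in range(n_perms):
        order = random.sample(channels, len(channels))
        kept = set()
        prev = evaluate(kept)
        for c in order:
            kept.add(c)
            cur = evaluate(kept)
            scores[c] += (cur - prev) / n_perms  # marginal contribution
            prev = cur
    return scores  # prune the units with the smallest estimated scores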
- Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models
We propose a framework to teach Large Language Models (LLMs) to generate explainable stock predictions.
A reflective agent learns how to explain past stock movements through self-reasoning, while the PPO trainer trains the model to generate the most likely explanations.
Our framework can outperform both traditional deep-learning and LLM methods in prediction accuracy and Matthews correlation coefficient.
arXiv Detail & Related papers (2024-02-06T03:18:58Z)
- Inverse Reinforcement Learning for Text Summarization
We introduce inverse reinforcement learning (IRL) as an effective paradigm for training abstractive summarization models.
Experimental results across datasets in different domains demonstrate the superiority of our proposed IRL model for summarization over MLE and RL baselines.
arXiv Detail & Related papers (2022-12-19T23:45:05Z)
- On the Convergence of the Shapley Value in Parametric Bayesian Learning Games
We show that for any two players, their difference in Shapley value converges in probability to the difference in Shapley value of a limiting game whose characteristic function is proportional to the log-determinant of the joint Fisher information.
Our result enables these Shapley values to be estimated without any costly computations of posterior-prior KL divergences.
arXiv Detail & Related papers (2022-05-16T02:29:14Z)
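A plausible rendering of the limiting game described above (the precise constants and regularity conditions are in the paper): for a coalition C of players contributing data, the limiting characteristic function is

  v_\infty(C) \propto \log \det I_C(\theta^*),

where I_C(\theta^*) is the Fisher information of the data jointly contributed by the players in C, evaluated at the limiting parameter \theta^*; the abstract states that Shapley-value differences under the finite-sample game converge in probability to those under v_\infty.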
- The Shapley Value in Machine Learning
We give an overview of the most important applications of the Shapley value in machine learning.
We examine the most crucial limitations of the Shapley value and point out directions for future research.
arXiv Detail & Related papers (2022-02-11T13:25:11Z)
- Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values
This study proposes a novel approach to explain cooperative strategies in multiagent RL using Shapley values.
These results could have implications for non-discriminatory decision making, ethical and responsible AI-derived decisions, and policy making under fairness constraints.
arXiv Detail & Related papers (2021-10-04T10:28:57Z)
- Evaluating Explanations: How much do explanations from the teacher aid students?
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
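A minimal sketch of this student-teacher protocol, assuming generic train_student and accuracy helpers rather than the paper's API: the value of a set of explanations is the student's held-out gain when trained with them versus without.

def explanation_value(train_data, explanations, test_data,
                      train_student, accuracy):
    """Held-out gain attributable to the teacher's explanations."""
    student_plain = train_student(train_data, explanations=None)
    student_aided = train_student(train_data, explanations=explanations)
    return (accuracy(student_aided, test_data)
            - accuracy(student_plain, test_data))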
- Problems with Shapley-value-based explanations as feature importance measures
Game-theoretic formulations of feature importance have become popular as a way to "explain" machine learning models.
We show that mathematical problems arise when Shapley values are used for feature importance.
We argue that Shapley values do not provide explanations which suit human-centric goals of explainability.
arXiv Detail & Related papers (2020-02-25T18:51:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.