Approximating Shapley Explanations in Reinforcement Learning
- URL: http://arxiv.org/abs/2511.06094v1
- Date: Sat, 08 Nov 2025 18:17:18 GMT
- Title: Approximating Shapley Explanations in Reinforcement Learning
- Authors: Daniel Beechey, Özgür Şimşek
- Abstract summary: We introduce FastSVERL, a scalable method for explaining reinforcement learning by approximating Shapley values. FastSVERL is designed to handle the unique challenges of reinforcement learning, including temporal dependencies across multi-step trajectories.
- Score: 1.1458853556386799
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning has achieved remarkable success in complex decision-making environments, yet its lack of transparency limits its deployment in practice, especially in safety-critical settings. Shapley values from cooperative game theory provide a principled framework for explaining reinforcement learning; however, the computational cost of Shapley explanations is an obstacle to their use. We introduce FastSVERL, a scalable method for explaining reinforcement learning by approximating Shapley values. FastSVERL is designed to handle the unique challenges of reinforcement learning, including temporal dependencies across multi-step trajectories, learning from off-policy data, and adapting to evolving agent behaviours in real time. FastSVERL introduces a practical, scalable approach for principled and rigorous interpretability in reinforcement learning.
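For context, the Shapley value of a feature is its marginal contribution to a payoff (here, some measure of agent behaviour or performance), averaged over all orderings of the features; the number of feature coalitions grows exponentially, which is the computational obstacle the abstract refers to. Below is a minimal sketch of the standard Monte Carlo permutation estimator, not the FastSVERL algorithm itself; `value_fn`, the baseline-masking scheme, and all names are illustrative assumptions.

```python
import random

def shapley_estimates(value_fn, state, baseline, num_samples=1000):
    """Estimate per-feature Shapley values by sampling feature orderings.

    value_fn: maps a feature dict to a scalar payoff (e.g., an estimated
              state value); stands in for the quantity being explained.
    state:    dict of observed feature values whose influence we explain.
    baseline: dict of reference values that stand in for "absent" features.
    """
    features = list(state)
    estimates = {f: 0.0 for f in features}
    for _ in range(num_samples):
        random.shuffle(features)          # one random ordering of features
        coalition = dict(baseline)        # start with every feature absent
        prev = value_fn(coalition)
        for f in features:
            coalition[f] = state[f]       # reveal feature f
            cur = value_fn(coalition)
            estimates[f] += cur - prev    # marginal contribution of f
            prev = cur
    return {f: total / num_samples for f, total in estimates.items()}

# Toy usage: for a linear payoff, Shapley values recover each term exactly.
payoff = lambda s: 2.0 * s["x"] + 1.0 * s["y"]
print(shapley_estimates(payoff, {"x": 3.0, "y": 5.0}, {"x": 0.0, "y": 0.0}))
# ~ {'x': 6.0, 'y': 5.0}
```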
Related papers
- Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning [62.680551162054975]
We introduce an end-to-end framework where LLMs learn to self-regulate the granularity of the reasoning steps through dynamic summarization. We apply reinforcement learning to incentivize this capability further, uncovering a critical insight: the accuracy gap between the highly efficient Fold mode and the exhaustive Unfold mode progressively narrows. Our Accordion-Thinker demonstrates that with learned self-compression, LLMs can tackle complex reasoning tasks with minimal token overhead.
arXiv Detail & Related papers (2026-02-03T08:34:20Z) - Retrieval-augmented Prompt Learning for Pre-trained Foundation Models [101.13972024610733]
We present RetroPrompt, which aims to achieve a balance between memorization and generalization. Unlike traditional prompting methods, RetroPrompt incorporates a retrieval mechanism throughout the input, training, and inference stages. We conduct comprehensive experiments on a variety of datasets across natural language processing and computer vision tasks to demonstrate the superior performance of our proposed approach.
arXiv Detail & Related papers (2025-12-23T08:15:34Z) - Stabilizing Reinforcement Learning with LLMs: Formulation and Practices [61.361819972410046]
We show why and under what conditions the true sequence-level reward can be optimized via a surrogate token-level objective in policy gradient methods such as REINFORCE. This insight provides a principled explanation for the crucial role of several widely adopted techniques in stabilizing RL training.
arXiv Detail & Related papers (2025-12-01T07:45:39Z) - Learning safe, constrained policies via imitation learning: Connection to Probabilistic Inference and a Naive Algorithm [0.22099217573031676]
This article introduces an imitation learning method for learning maximum entropy policies that comply with constraints demonstrated by an expert executing a task. Experiments show that the method can learn effective policy models for constraint-abiding behaviour in settings with multiple constraints of different types, and with the ability to generalize.
arXiv Detail & Related papers (2025-07-09T12:11:27Z) - Guided Policy Optimization under Partial Observability [36.853129816484845]
Reinforcement Learning (RL) in partially observable environments poses significant challenges due to the complexity of learning under uncertainty. We introduce Guided Policy Optimization (GPO), a framework that co-trains a guider and a learner. We theoretically demonstrate that this learning scheme achieves optimality comparable to direct RL, thereby overcoming key limitations inherent in existing approaches.
arXiv Detail & Related papers (2025-05-21T12:01:08Z) - A Theoretical Framework for Explaining Reinforcement Learning with Shapley Values [0.0]
Reinforcement learning agents can achieve super-human performance in complex decision-making tasks, but their behaviour is often difficult to understand and explain. We identify three core explanatory targets that together provide a comprehensive view of reinforcement learning agents. We develop a unified theoretical framework for explaining these three elements of reinforcement learning agents through the influence of individual features that the agent observes in its environment.
arXiv Detail & Related papers (2025-05-12T17:48:28Z) - Online inductive learning from answer sets for efficient reinforcement learning exploration [52.03682298194168]
We exploit inductive learning of answer set programs to learn a set of logical rules representing an explainable approximation of the agent policy. We then perform answer set reasoning on the learned rules to guide the exploration of the learning agent at the next batch. Our methodology produces a significant boost in the discounted return achieved by the agent, even in the first batches of training.
arXiv Detail & Related papers (2025-01-13T16:13:22Z) - Explaining Reinforcement Learning with Shapley Values [0.0]
We present a theoretical analysis of explaining reinforcement learning using Shapley values.
Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning.
We then develop an approach that uses Shapley values to explain agent performance.
arXiv Detail & Related papers (2023-06-09T10:52:39Z) - Resilient Constrained Learning [94.27081585149836]
This paper presents a constrained learning approach that adapts the requirements while simultaneously solving the learning task.
We call this approach resilient constrained learning after the term used to describe ecological systems that adapt to disruptions by modifying their operation.
arXiv Detail & Related papers (2023-06-04T18:14:18Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
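As background for the last entry above, here is a minimal sketch of the generic TD(0) update for Q-learning with linear function approximation; this is the textbook update, not the exploration variant that paper proposes, and all names are illustrative.

```python
import numpy as np

def linear_q_update(w, phi_sa, reward, phi_next_greedy, alpha=0.1, gamma=0.99):
    """One TD(0) step for Q(s, a) = w @ phi(s, a) with linear features.

    phi_sa:          feature vector of the current state-action pair.
    phi_next_greedy: feature vector of the greedy action in the next state.
    """
    td_target = reward + gamma * w @ phi_next_greedy  # bootstrapped target
    td_error = td_target - w @ phi_sa
    return w + alpha * td_error * phi_sa              # semi-gradient step

# Toy usage with 3-dimensional features.
w = np.zeros(3)
w = linear_q_update(w, np.array([1.0, 0.0, 0.5]), 1.0, np.array([0.0, 1.0, 0.2]))
```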