Understanding and Diagnosing Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2406.16979v1
- Date: Sun, 23 Jun 2024 18:10:16 GMT
- Title: Understanding and Diagnosing Deep Reinforcement Learning
- Authors: Ezgi Korkmaz
- Abstract summary: Deep neural policies have recently been installed in a diverse range of settings, from biotechnology to automated financial systems.
We introduce a theoretically founded technique that provides a systematic analysis of the unstable directions in the deep neural policy decision boundary across both time and space.
- Score: 14.141453107129403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural policies have recently been installed in a diverse range of settings, from biotechnology to automated financial systems. However, the utilization of deep neural networks to approximate the value function leads to concerns on the decision boundary stability, in particular, with regard to the sensitivity of policy decision making to indiscernible, non-robust features due to highly non-convex and complex deep neural manifolds. These concerns constitute an obstruction to understanding the reasoning made by deep neural policies, and their foundational limitations. Hence, it is crucial to develop techniques that aim to understand the sensitivities in the learnt representations of neural network policies. To achieve this we introduce a theoretically founded method that provides a systematic analysis of the unstable directions in the deep neural policy decision boundary across both time and space. Through experiments in the Arcade Learning Environment (ALE), we demonstrate the effectiveness of our technique for identifying correlated directions of instability, and for measuring how sample shifts remold the set of sensitive directions in the neural policy landscape. Most importantly, we demonstrate that state-of-the-art robust training techniques yield learning of disjoint unstable directions, with dramatically larger oscillations over time, when compared to standard training. We believe our results reveal the fundamental properties of the decision process made by reinforcement learning policies, and can help in constructing reliable and robust deep neural policies.
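The abstract does not come with code, so the following is a minimal, illustrative Python sketch of the general idea: estimating a high-sensitivity direction of a Q-network's decision boundary from the gradient of the action gap, and checking how correlated those directions are across consecutive time steps. The QNet class, the dimensions, and the action-gap criterion are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): probe high-sensitivity
# directions of a Q-network's decision boundary and compare them across time.
# QNet, state_dim, and n_actions are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNet(nn.Module):
    def __init__(self, state_dim=8, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))
    def forward(self, s):
        return self.net(s)

def sensitive_direction(q_net, state):
    """Direction along which the gap between the best and runner-up Q-values
    shrinks fastest (a proxy for an unstable decision-boundary direction)."""
    s = state.clone().requires_grad_(True)
    q = q_net(s)
    top2 = torch.topk(q, 2).values
    gap = top2[0] - top2[1]            # action gap at this state
    gap.backward()
    d = -s.grad                        # descend the gap, i.e. move toward the boundary
    return d / (d.norm() + 1e-12)

if __name__ == "__main__":
    torch.manual_seed(0)
    q_net = QNet()
    # A toy "trajectory" of states standing in for consecutive time steps.
    states = [torch.randn(8) for _ in range(5)]
    dirs = [sensitive_direction(q_net, s) for s in states]
    # Correlation of sensitive directions across time (cosine similarity).
    for t in range(len(dirs) - 1):
        cos = F.cosine_similarity(dirs[t], dirs[t + 1], dim=0).item()
        print(f"t={t}->{t+1}: cosine similarity {cos:+.3f}")
```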
Related papers
- The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider the classical distribution-agnostic framework and algorithms minimising empirical risks.
We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z) - Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions [8.173034693197351]
We propose a novel method to detect the presence of non-robust directions in MDPs.
Our method provides a theoretical basis for the fundamental cut-off between safe observations and adversarial observations.
Most significantly, we demonstrate the effectiveness of our approach even in the setting where non-robust directions are explicitly optimized to circumvent our proposed method.
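As an illustration of the safe-versus-adversarial cut-off idea (an assumption, not the paper's actual detector or threshold), one could flag an observation when a tiny step along a candidate direction shifts the policy's action distribution by more than a fixed tolerance; a toy Q-network such as the QNet sketched under the abstract above can stand in for the policy.

```python
# Illustrative sketch only (not the paper's detector): flag an observation as
# suspicious when a small step along an estimated non-robust direction changes
# the policy's action distribution by more than a fixed cut-off `tau`.
import torch
import torch.nn.functional as F

def policy_shift(q_net, obs, direction, eps=1e-2):
    """KL divergence between softmax policies at obs and obs + eps*direction."""
    with torch.no_grad():
        p = F.log_softmax(q_net(obs), dim=-1)
        q = F.log_softmax(q_net(obs + eps * direction), dim=-1)
    return F.kl_div(q, p, log_target=True, reduction="sum").item()

def is_adversarial(q_net, obs, direction, tau=0.1):
    # `tau` is a hypothetical cut-off; the paper derives its threshold theoretically.
    return policy_shift(q_net, obs, direction) > tau
```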
arXiv Detail & Related papers (2023-06-09T13:11:05Z) - Adversarial Robust Deep Reinforcement Learning Requires Redefining Robustness [7.6146285961466]
We show that high sensitivity directions are more abundant in the deep neural policy landscape and can be found via more natural means in a black-box setting.
We show that vanilla training techniques intriguingly result in learning more robust policies compared to the policies learnt via the state-of-the-art adversarial training techniques.
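A black-box flavour of this probing can be illustrated without any gradient information by sampling random unit directions and counting how often a small step flips the greedy action; this is a hypothetical sketch, not the procedure from the paper.

```python
# Black-box sketch (an assumption, not the paper's procedure): estimate how
# often a small step along a random direction flips the greedy action.
import torch

def flip_rate(q_net, obs, eps=1e-2, n_probes=256):
    with torch.no_grad():
        base_action = q_net(obs).argmax().item()
        flips = 0
        for _ in range(n_probes):
            d = torch.randn_like(obs)
            d = eps * d / d.norm()
            if q_net(obs + d).argmax().item() != base_action:
                flips += 1
    return flips / n_probes
```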
arXiv Detail & Related papers (2023-01-17T16:54:33Z) - Interpreting Neural Policies with Disentangled Tree Representations [58.769048492254555]
We study interpretability of compact neural policies through the lens of disentangled representation.
We leverage decision trees to obtain factors of variation for disentanglement in robot learning.
We introduce interpretability metrics that measure disentanglement of learned neural dynamics.
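One simple way to picture the decision-tree angle (a sketch, not the paper's pipeline) is to distill a policy's behaviour into a shallow tree and read its splits and feature importances as candidate factors of variation; the synthetic states and labels below are placeholders for real rollout data.

```python
# Illustrative sketch: distill a neural policy's behaviour into a shallow
# decision tree and inspect its splits. Assumes scikit-learn and a dataset of
# (state, action) pairs collected from policy rollouts.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
states = rng.normal(size=(1000, 8))                              # stand-in rollout states
actions = (states[:, 0] + 0.5 * states[:, 3] > 0).astype(int)    # stand-in policy labels

tree = DecisionTreeClassifier(max_depth=3).fit(states, actions)
print(export_text(tree, feature_names=[f"s{i}" for i in range(8)]))
print("feature importances:", tree.feature_importances_)
```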
arXiv Detail & Related papers (2022-10-13T01:10:41Z) - Deep Reinforcement Learning Policies Learn Shared Adversarial Features Across MDPs [0.0]
We propose a framework to investigate the decision boundary and loss landscape similarities across states and across MDPs.
We conduct experiments in various games from Arcade Learning Environment, and discover that high sensitivity directions for neural policies are correlated across MDPs.
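A rough sketch of such a cross-MDP comparison (an assumption, not the paper's experiment) is to compute a sensitive direction for two independently trained policies on a shared observation and measure their cosine similarity, reusing the toy QNet and sensitive_direction helper from the sketch under the abstract above.

```python
# Sketch: compare sensitive directions found for two different policies
# (stand-ins for policies trained on two MDPs) via cosine similarity.
import torch
import torch.nn.functional as F

torch.manual_seed(1)
q_net_a, q_net_b = QNet(), QNet()   # toy policies; see the QNet sketch above
state = torch.randn(8)              # a state embedded in a common observation space
d_a = sensitive_direction(q_net_a, state)
d_b = sensitive_direction(q_net_b, state)
print("cross-MDP cosine similarity:", F.cosine_similarity(d_a, d_b, dim=0).item())
```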
arXiv Detail & Related papers (2021-12-16T17:10:41Z) - Investigating Vulnerabilities of Deep Neural Policies [0.0]
Reinforcement learning policies based on deep neural networks are vulnerable to imperceptible adversarial perturbations to their inputs.
Recent work has proposed several methods to improve the robustness of deep reinforcement learning agents to adversarial perturbations.
We study the effects of adversarial training on the neural policy learned by the agent.
arXiv Detail & Related papers (2021-08-30T10:04:50Z) - Robust Explainability: A Tutorial on Gradient-Based Attribution Methods for Deep Neural Networks [1.5854438418597576]
We present gradient-based interpretability methods for explaining decisions of deep neural networks.
We discuss the role that adversarial robustness plays in having meaningful explanations.
We conclude with the future directions for research in the area at the convergence of robustness and explainability.
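The simplest member of the gradient-based attribution family the tutorial covers is vanilla saliency, i.e. the gradient of the chosen output with respect to the input; a minimal sketch (not code from the paper) is:

```python
# Vanilla saliency: per-feature attribution magnitudes from input gradients.
import torch

def saliency(model, x, target_index):
    x = x.clone().requires_grad_(True)
    out = model(x)
    out[target_index].backward()
    return x.grad.abs()   # larger magnitude = stronger attributed influence
```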
arXiv Detail & Related papers (2021-07-23T18:06:29Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
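Since the attribute-space generator is not specified in this summary, the sketch below substitutes a generic PGD-based adversarial-training step on the raw input as a stand-in; the model, optimizer, and hyper-parameters are illustrative assumptions, not the paper's method.

```python
# Generic adversarial-training sketch (PGD on the raw input) as a stand-in;
# the paper instead generates perturbations in an attribute space.
import torch
import torch.nn.functional as F

def pgd(model, x, y, eps=0.03, alpha=0.01, steps=5):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
    return x_adv

def adversarial_training_step(model, optimizer, x, y):
    model.train()
    x_adv = pgd(model, x, y)                       # worst-case samples for this batch
    loss = F.cross_entropy(model(x_adv), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```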
arXiv Detail & Related papers (2020-12-03T10:17:30Z) - Developing Constrained Neural Units Over Time [81.19349325749037]
This paper focuses on an alternative way of defining Neural Networks that differs from the majority of existing approaches.
The structure of the neural architecture is defined by means of a special class of constraints that also extend to the interaction with data.
The proposed theory is cast into the time domain, in which data are presented to the network in an ordered manner.
arXiv Detail & Related papers (2020-09-01T09:07:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.