Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation
- URL: http://arxiv.org/abs/2503.05238v1
- Date: Fri, 07 Mar 2025 08:40:41 GMT
- Title: Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation
- Authors: Mohit Prashant, Arvind Easwaran, Suman Das, Michael Yuhas
- Abstract summary: Training environments may not reflect real-life environments. Learning-enabled systems are often equipped with out-of-distribution detectors that alert when a trained system encounters a state it does not recognize or in which it exhibits uncertainty.
- Score: 2.0836728378106883
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An issue concerning the use of deep reinforcement learning (RL) agents is whether they can be trusted to perform reliably when deployed, as training environments may not reflect real-life environments. Anticipating instances outside their training scope, learning-enabled systems are often equipped with out-of-distribution (OOD) detectors that alert when a trained system encounters a state it does not recognize or in which it exhibits uncertainty. There exists limited work conducted on the problem of OOD detection within RL, with prior studies being unable to achieve a consensus on the definition of OOD execution within the context of RL. By framing our problem using a Markov Decision Process, we assume there is a transition distribution mapping each state-action pair to another state with some probability. Based on this, we consider the following definition of OOD execution within RL: A transition is OOD if its probability during real-life deployment differs from the transition distribution encountered during training. As such, we utilize conditional variational autoencoders (CVAE) to approximate the transition dynamics of the training environment and implement a conformity-based detector using reconstruction loss that is able to guarantee OOD detection with a pre-determined confidence level. We evaluate our detector by adapting existing benchmarks and compare it with existing OOD detection models for RL.
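To make the described pipeline concrete, the following is a minimal sketch, assuming PyTorch, of a CVAE that models the training transition distribution p(s' | s, a) and a split-conformal threshold on its reconstruction loss, so that in-distribution transitions are flagged with probability at most roughly alpha. The network sizes, the MSE-based score, and the calibration split are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch: CVAE transition model + conformal threshold on reconstruction loss.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransitionCVAE(nn.Module):
    """CVAE conditioned on (s, a) that reconstructs the next state s'."""
    def __init__(self, state_dim, action_dim, latent_dim=8, hidden=128):
        super().__init__()
        cond_dim = state_dim + action_dim
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),          # mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s, a, s_next):
        cond = torch.cat([s, a], dim=-1)
        mu, logvar = self.encoder(torch.cat([s_next, cond], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        recon = self.decoder(torch.cat([z, cond], dim=-1))
        return recon, mu, logvar

def elbo_loss(recon, s_next, mu, logvar):
    """Training objective: reconstruction error plus KL divergence."""
    rec = F.mse_loss(recon, s_next, reduction="none").sum(-1)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)
    return (rec + kl).mean()

@torch.no_grad()
def nonconformity(model, s, a, s_next):
    """Per-transition reconstruction error, used as the nonconformity score."""
    recon, _, _ = model(s, a, s_next)
    return F.mse_loss(recon, s_next, reduction="none").sum(-1)

def conformal_threshold(cal_scores, alpha=0.05):
    """Split-conformal threshold: the (1 - alpha) quantile of calibration
    scores, so in-distribution transitions exceed it with probability ~alpha."""
    n = len(cal_scores)
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(cal_scores, q))

# Usage: fit the CVAE on in-distribution transitions, calibrate on held-out
# transitions, then flag deployment transitions whose score exceeds tau.
# tau = conformal_threshold(nonconformity(model, s_cal, a_cal, sn_cal).numpy())
# is_ood = nonconformity(model, s, a, s_next) > tau
```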
Related papers
- Advancing Out-of-Distribution Detection via Local Neuroplasticity [60.53625435889467]
This paper presents a novel OOD detection method that leverages the unique local neuroplasticity property of Kolmogorov-Arnold Networks (KANs). Our method compares the activation patterns of a trained KAN against its untrained counterpart to detect OOD samples. We validate our approach on benchmarks from image and medical domains, demonstrating superior performance and robustness compared to state-of-the-art techniques.
arXiv Detail & Related papers (2025-02-20T11:13:41Z)
- Semantic or Covariate? A Study on the Intractable Case of Out-of-Distribution Detection [70.57120710151105]
We provide a more precise definition of the Semantic Space for the ID distribution.
We also define the "Tractable OOD" setting which ensures the distinguishability of OOD and ID distributions.
arXiv Detail & Related papers (2024-11-18T03:09:39Z)
- Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection [3.7384109981836158]
We study the problem of out-of-distribution (OOD) detection in reinforcement learning (RL).
We propose a clarification of terminology for OOD detection in RL, which aligns it with the literature from other machine learning domains.
We present new benchmark scenarios for OOD detection, which introduce anomalies with temporal autocorrelation into different components of the agent-environment loop.
We find that our proposed detector, DEXTER, can reliably identify anomalies across benchmark scenarios, exhibiting superior performance compared to both state-of-the-art OOD detectors and high-dimensional changepoint detectors adopted from statistics.
arXiv Detail & Related papers (2024-04-10T15:39:49Z)
- How to Enable Uncertainty Estimation in Proximal Policy Optimization [20.468991996052953]
Existing uncertainty estimation techniques have not seen widespread adoption in on-policy deep RL.
We propose definitions of uncertainty and OOD for Actor-Critic RL algorithms.
We show experimentally that the recently proposed method of Masksembles strikes a favourable balance among the surveyed methods.
arXiv Detail & Related papers (2022-10-07T15:56:59Z)
- Breaking Down Out-of-Distribution Detection: Many Methods Based on OOD Training Data Estimate a Combination of the Same Core Quantities [104.02531442035483]
The goal of this paper is to recognize common objectives as well as to identify the implicit scoring functions of different OOD detection methods.
We show that binary discrimination between in- and (different) out-distributions is equivalent to several distinct formulations of the OOD detection problem.
We also show that the confidence loss which is used by Outlier Exposure has an implicit scoring function which differs in a non-trivial fashion from the theoretically optimal scoring function.
arXiv Detail & Related papers (2022-06-20T16:32:49Z)
- Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning [0.0]
Reinforcement Learning (RL)-based solutions are being adopted in a variety of domains including robotics, healthcare and industrial automation.
Most attention is given to cases where these solutions work well, but they can fail when presented with out-of-distribution inputs.
Out-of-distribution detection for RL is generally not well covered in the literature, and there is a lack of benchmarks for this task.
arXiv Detail & Related papers (2021-12-05T22:21:11Z)
- On the Practicality of Deterministic Epistemic Uncertainty [106.06571981780591]
Deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution data.
It remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications.
arXiv Detail & Related papers (2021-07-01T17:59:07Z)
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
- Uncertainty-Based Out-of-Distribution Classification in Deep Reinforcement Learning [17.10036674236381]
Wrong predictions for out-of-distribution data can cause safety-critical situations in machine learning systems.
We propose a framework for uncertainty-based OOD classification: UBOOD.
We show that UBOOD produces reliable classification results when combined with ensemble-based estimators (see the sketch at the end of this list).
arXiv Detail & Related papers (2019-12-31T09:52:49Z)
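As a companion to the UBOOD entry above, here is a minimal sketch, assuming PyTorch, of ensemble-disagreement scoring as one way an uncertainty-based OOD classifier could pair with ensemble estimators; the Q-head architecture and the variance-based score are illustrative assumptions, not the cited paper's exact formulation.

```python
# Sketch: ensemble-disagreement score for flagging likely OOD states.
import torch
import torch.nn as nn

class QHead(nn.Module):
    """One ensemble member: a small Q-value network over discrete actions."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, s):
        return self.net(s)

@torch.no_grad()
def ensemble_ood_score(heads, s):
    """Variance of Q-estimates across independently trained heads; high
    disagreement is read as epistemic uncertainty, i.e. a likely OOD state."""
    qs = torch.stack([h(s) for h in heads])   # (members, batch, actions)
    return qs.var(dim=0).mean(dim=-1)         # one score per state
```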