Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2112.02694v1
- Date: Sun, 5 Dec 2021 22:21:11 GMT
- Authors: Aaqib Parvez Mohammed, Matias Valdenegro-Toro
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Reinforcement Learning (RL) based solutions are being adopted in a
variety of domains, including robotics, health care, and industrial automation.
Most attention is given to cases where these solutions work well, but they fail
when presented with out-of-distribution inputs; RL policies share the same
faults as most machine learning models. Out-of-distribution detection for RL is
generally not well covered in the literature, and there is a lack of benchmarks
for this task. In this work we propose a benchmark to evaluate OOD detection
methods in a Reinforcement Learning setting, by modifying the physical
parameters of non-visual standard environments or corrupting the state
observations of visual environments. We discuss ways to generate custom RL
environments that can produce OOD data, and evaluate three uncertainty methods
on the OOD detection task. Our results show that ensemble methods have the best
OOD detection performance, with a lower standard deviation across multiple
environments.
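The core idea behind ensemble-based OOD detection in the abstract can be sketched as follows: an ensemble of independently trained predictors tends to agree on in-distribution states and disagree on novel ones, so their disagreement serves as an OOD score. This is a minimal illustrative sketch, not the paper's implementation; the function names, the use of standard deviation as the disagreement measure, and the threshold value are all assumptions.

```python
# Minimal sketch of ensemble-disagreement OOD scoring.
# Names, the std-based score, and the threshold are illustrative assumptions,
# not the paper's actual API.
import numpy as np

def ensemble_ood_score(predictions: np.ndarray) -> float:
    """OOD score for one state: disagreement (std) across ensemble members.

    predictions: shape (n_members,), each member's value estimate for the
    same state. High disagreement suggests the state is out-of-distribution.
    """
    return float(np.std(predictions))

def is_ood(predictions: np.ndarray, threshold: float) -> bool:
    """Flag a state as OOD when ensemble disagreement exceeds a threshold
    calibrated on in-distribution data."""
    return ensemble_ood_score(predictions) > threshold

# Members agree on a familiar state, disagree on a novel one.
in_dist = np.array([1.01, 0.98, 1.02, 0.99])
ood = np.array([0.2, 1.9, -0.7, 3.1])
assert ensemble_ood_score(in_dist) < ensemble_ood_score(ood)
```

In practice the threshold would be calibrated on held-out in-distribution states, e.g. chosen as a high percentile of the disagreement scores observed during training.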
Related papers
- Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks [17.520137576423593]
We aim to provide a consolidated view of the two largest sub-fields within the community: out-of-distribution (OOD) detection and open-set recognition (OSR).
We perform rigorous cross-evaluation between state-of-the-art methods in the OOD detection and OSR settings and identify a strong correlation between the performances of methods for them.
We propose a new, large-scale benchmark setting which we suggest better disentangles the problems tackled by OOD detection and OSR.
arXiv Detail & Related papers (2024-08-29T17:55:07Z) - Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection [3.7384109981836158]
We study the problem of out-of-distribution (OOD) detection in reinforcement learning (RL).
We propose a clarification of terminology for OOD detection in RL, which aligns it with the literature from other machine learning domains.
We present new benchmark scenarios for OOD detection, which introduce anomalies with temporal autocorrelation into different components of the agent-environment loop.
We find that the proposed DEXTER detector can reliably identify anomalies across benchmark scenarios, exhibiting superior performance compared to both state-of-the-art OOD detectors and high-dimensional changepoint detectors adopted from statistics.
arXiv Detail & Related papers (2024-04-10T15:39:49Z) - Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection [9.656342063882555]
We study five types of distribution shifts and evaluate the performance of recent OOD detection methods on each of them.
Our findings reveal that while these methods excel in detecting unknown classes, their performance is inconsistent when encountering other types of distribution shifts.
We present an ensemble approach that offers a more consistent and comprehensive solution for broad OOD detection.
arXiv Detail & Related papers (2023-08-22T14:52:44Z) - AUTO: Adaptive Outlier Optimization for Online Test-Time OOD Detection [81.49353397201887]
Out-of-distribution (OOD) detection is crucial to deploying machine learning models in open-world applications.
We introduce a novel paradigm called test-time OOD detection, which utilizes unlabeled online data directly at test time to improve OOD detection performance.
We propose adaptive outlier optimization (AUTO), which consists of an in-out-aware filter, an ID memory bank, and a semantically-consistent objective.
arXiv Detail & Related papers (2023-03-22T02:28:54Z) - Unsupervised Evaluation of Out-of-distribution Detection: A Data-centric Perspective [55.45202687256175]
Out-of-distribution (OOD) detection methods assume that they have test ground truths, i.e., whether individual test samples are in-distribution (IND) or OOD.
In this paper, we are the first to introduce the unsupervised evaluation problem in OOD detection.
We propose three methods to compute Gscore as an unsupervised indicator of OOD detection performance.
arXiv Detail & Related papers (2023-02-16T13:34:35Z) - Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE (POsthoc pseudo-Ood REgularization), which generates pseudo-OOD samples using in-distribution (IND) data.
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z) - Training OOD Detectors in their Natural Habitats [31.565635192716712]
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild.
Recent methods use auxiliary outlier data to regularize the model for improved OOD detection.
We propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples.
arXiv Detail & Related papers (2022-02-07T15:38:39Z) - Triggering Failures: Out-Of-Distribution detection by learning from local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet, associated with a dedicated training scheme based on Local Adversarial Attacks (LAA).
We show it obtains top performance in both speed and accuracy when compared to ten recent methods from the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z) - Out-of-Distribution Dynamics Detection: RL-Relevant Benchmarks and Results [21.054448068345348]
We study the problem of out-of-distribution dynamics (OODD) detection, which involves detecting when the dynamics of a temporal process change compared to the training-distribution dynamics.
This problem is particularly important in the context of deep RL, where learned controllers often overfit to the training environment.
Our first contribution is to design a set of OODD benchmarks derived from common RL environments with varying types and intensities of OODD.
Our second contribution is to design a strong OODD baseline approach based on recurrent implicit quantile networks (RIQNs), which monitors autoregressive prediction errors for OODD detection.
arXiv Detail & Related papers (2021-07-11T06:40:02Z) - Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z) - Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.