Related papers: A Distance-based Anomaly Detection Framework for Deep Reinforcement Learning

A Distance-based Anomaly Detection Framework for Deep Reinforcement Learning

URL: http://arxiv.org/abs/2109.09889v3
Date: Fri, 18 Oct 2024 17:32:27 GMT
Title: A Distance-based Anomaly Detection Framework for Deep Reinforcement Learning
Authors: Hongming Zhang, Ke Sun, Bo Xu, Linglong Kong, Martin Müller,
Abstract summary: In deep reinforcement learning (RL) systems, abnormal states pose significant risks by potentially triggering unpredictable behaviors and unsafe actions. We propose a novel Mahalanobis distance-based (MD) anomaly detection framework, called textitMDX, for deep RL algorithms. MDX simultaneously addresses random, adversarial, and out-of-distribution (OOD) state outliers in both offline and online settings.
Score: 33.623558899286635
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In deep reinforcement learning (RL) systems, abnormal states pose significant risks by potentially triggering unpredictable behaviors and unsafe actions, thus impeding the deployment of RL systems in real-world scenarios. It is crucial for reliable decision-making systems to have the capability to cast an alert whenever they encounter unfamiliar observations that they are not equipped to handle. In this paper, we propose a novel Mahalanobis distance-based (MD) anomaly detection framework, called \textit{MDX}, for deep RL algorithms. MDX simultaneously addresses random, adversarial, and out-of-distribution (OOD) state outliers in both offline and online settings. It utilizes Mahalanobis distance within class-conditional distributions for each action and operates within a statistical hypothesis testing framework under the Gaussian assumption. We further extend it to robust and distribution-free versions by incorporating Robust MD and conformal inference techniques. Through extensive experiments on classical control environments, Atari games, and autonomous driving scenarios, we demonstrate the effectiveness of our MD-based detection framework. MDX offers a simple, unified, and practical anomaly detection tool for enhancing the safety and reliability of RL systems in real-world applications.

Related papers

An Adversarial-Driven Experimental Study on Deep Learning for RF Fingerprinting [6.503988096115075]
Radio frequency (RF) fingerprinting has emerged as a promising physical-layer device identification mechanism.<n>Deep learning (DL) methods have demonstrated state-of-the-art performance in this domain.<n>We show that training DL models on raw received signals causes the models to entangle RF fingerprints with environmental and signal-pattern features.
arXiv Detail & Related papers (2025-07-18T17:42:20Z)
Anomalous Decision Discovery using Inverse Reinforcement Learning [3.3675535571071746]
Anomaly detection plays a critical role in Autonomous Vehicles (AVs) by identifying unusual behaviors through perception systems.<n>Current approaches, which often rely on predefined thresholds or supervised learning paradigms, exhibit reduced efficacy when confronted with unseen scenarios.<n>We present Trajectory-Reward Guided Adaptive Pre-training (TRAP), a novel IRL framework for anomaly detection.
arXiv Detail & Related papers (2025-07-06T17:01:02Z)
SCADE: Scalable Framework for Anomaly Detection in High-Performance System [0.0]
Command-line interfaces remain integral to high-performance computing environments. Traditional security solutions struggle to detect anomalies due to their context-specific nature, lack of labeled data, and the prevalence of sophisticated attacks like Living-off-the-Land (LOL) We introduce the Scalable Command-Line Anomaly Detection Engine (SCADE), a framework that combines global statistical models with local context-specific analysis for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-12-05T15:39:13Z)
Scalable Offline Reinforcement Learning for Mean Field Games [6.8267158622784745]
Off-MMD is a novel mean-field RL algorithm that approximates equilibrium policies in mean-field games using purely offline data. Our algorithm scales to complex environments and demonstrates strong performance on benchmark tasks like crowd exploration or navigation.
arXiv Detail & Related papers (2024-10-23T14:16:34Z)
Learning-Based Shielding for Safe Autonomy under Unknown Dynamics [9.786577115501602]
Shielding is a method used to guarantee the safety of a system under a black-box controller. This paper proposes a data-driven shielding methodology that guarantees safety for unknown systems.
arXiv Detail & Related papers (2024-10-07T16:10:15Z)
HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection [66.42229859018775]
We introduce a unified, high-capacity weakly supervised object detection (WSOD) network called HUWSOD. HUWSOD incorporates a self-supervised proposal generator and an autoencoder proposal generator with a multi-rate re-supervised pyramid to replace traditional object proposals. Our findings indicate that randomly boxes, although significantly different from well-designed offline object proposals, are effective for WSOD training.
arXiv Detail & Related papers (2024-06-27T17:59:49Z)
Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification. We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations. Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
Diminishing Empirical Risk Minimization for Unsupervised Anomaly Detection [0.0]
Empirical Risk Minimization (ERM) assumes that the performance of an algorithm on an unknown distribution can be approximated by averaging losses on the known training set. We propose a novel Diminishing Empirical Risk Minimization (DERM) framework to break through the limitations of ERM. DERM adaptively adjusts the impact of individual losses through a well-devised aggregation strategy.
arXiv Detail & Related papers (2022-05-29T14:18:26Z)
Inter-Domain Fusion for Enhanced Intrusion Detection in Power Systems: An Evidence Theoretic and Meta-Heuristic Approach [0.0]
False alerts due to/ compromised IDS in ICS networks can lead to severe economic and operational damage. This work presents an approach for reducing false alerts in CPS power systems by dealing with uncertainty without prior distribution of alerts.
arXiv Detail & Related papers (2021-11-20T00:05:39Z)
Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations [50.37808220291108]
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations. We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety. We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior.
arXiv Detail & Related papers (2021-11-18T23:21:00Z)
Improving Variational Autoencoder based Out-of-Distribution Detection for Embedded Real-time Applications [2.9327503320877457]
Out-of-distribution (OD) detection is an emerging approach to address the challenge of detecting out-of-distribution in real-time. In this paper, we show how we can robustly detect hazardous motion around autonomous driving agents. Our methods significantly improve detection capabilities of OoD factors to unique driving scenarios, 42% better than state-of-the-art approaches. Our model also generalized near-perfectly, 97% better than the state-of-the-art across the real-world and simulation driving data sets experimented.
arXiv Detail & Related papers (2021-07-25T07:52:53Z)
Detecting Rewards Deterioration in Episodic Reinforcement Learning [63.49923393311052]
In many RL applications, once training ends, it is vital to detect any deterioration in the agent performance as soon as possible. We consider an episodic framework, where the rewards within each episode are not independent, nor identically-distributed, nor Markov. We define the mean-shift in a way corresponding to deterioration of a temporal signal (such as the rewards), and derive a test for this problem with optimal statistical power.
arXiv Detail & Related papers (2020-10-22T12:45:55Z)
Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction. We put forward an alternative measure of anomaly score to replace the reconstruction-based metric. Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples. We propose a modular acceleration system, called SUOD, to address it.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.