Anomalous Decision Discovery using Inverse Reinforcement Learning
- URL: http://arxiv.org/abs/2507.04464v1
- Date: Sun, 06 Jul 2025 17:01:02 GMT
- Title: Anomalous Decision Discovery using Inverse Reinforcement Learning
- Authors: Ashish Bastola, Mert D. Pesé, Long Cheng, Jonathon Smereka, Abolfazl Razi,
- Abstract summary: Anomaly detection plays a critical role in Autonomous Vehicles (AVs) by identifying unusual behaviors through perception systems.<n>Current approaches, which often rely on predefined thresholds or supervised learning paradigms, exhibit reduced efficacy when confronted with unseen scenarios.<n>We present Trajectory-Reward Guided Adaptive Pre-training (TRAP), a novel IRL framework for anomaly detection.
- Score: 3.3675535571071746
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anomaly detection plays a critical role in Autonomous Vehicles (AVs) by identifying unusual behaviors through perception systems that could compromise safety and lead to hazardous situations. Current approaches, which often rely on predefined thresholds or supervised learning paradigms, exhibit reduced efficacy when confronted with unseen scenarios, sensor noise, and occlusions, leading to potential safety-critical failures. Moreover, supervised methods require large annotated datasets, limiting their real-world feasibility. To address these gaps, we propose an anomaly detection framework based on Inverse Reinforcement Learning (IRL) to infer latent driving intentions from sequential perception data, thus enabling robust identification. Specifically, we present Trajectory-Reward Guided Adaptive Pre-training (TRAP), a novel IRL framework for anomaly detection, to address two critical limitations of existing methods: noise robustness and generalization to unseen scenarios. Our core innovation is implicitly learning temporal credit assignments via reward and worst-case supervision. We leverage pre-training with variable-horizon sampling to maximize time-to-consequence, resulting in early detection of behavior deviation. Experiments on 14,000+ simulated trajectories demonstrate state-of-the-art performance, achieving 0.90 AUC and 82.2\% F1-score - outperforming similarly trained supervised and unsupervised baselines by 39\% on Recall and 12\% on F1-score, respectively. Similar performance is achieved while exhibiting robustness to various noise types and generalization to unseen anomaly types. Our code will be available at: https://github.com/abastola0/TRAP.git
Related papers
- Typicalness-Aware Learning for Failure Detection [26.23185979968123]
Deep neural networks (DNNs) often suffer from the overconfidence issue, where incorrect predictions are made with high confidence scores.
We propose a novel approach called Typicalness-Aware Learning (TAL) to address this issue and improve failure detection performance.
arXiv Detail & Related papers (2024-11-04T11:09:47Z) - Semi-supervised Open-World Object Detection [74.95267079505145]
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD)
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z) - OpenNet: Incremental Learning for Autonomous Driving Object Detection
with Balanced Loss [3.761247766448379]
The proposed method can obtain better performance than that of the existing methods.
The Experimental results upon the CODA dataset show that the proposed method can obtain better performance than that of the existing methods.
arXiv Detail & Related papers (2023-11-25T06:02:50Z) - CARLA: Self-supervised Contrastive Representation Learning for Time Series Anomaly Detection [53.83593870825628]
One main challenge in time series anomaly detection (TSAD) is the lack of labelled data in many real-life scenarios.
Most of the existing anomaly detection methods focus on learning the normal behaviour of unlabelled time series in an unsupervised manner.
We introduce a novel end-to-end self-supervised ContrAstive Representation Learning approach for time series anomaly detection.
arXiv Detail & Related papers (2023-08-18T04:45:56Z) - Realistic Safety-critical Scenarios Search for Autonomous Driving System
via Behavior Tree [8.286351881735191]
We propose the Matrix-Fuzzer, a behavior tree-based testing framework, to automatically generate realistic safety-critical test scenarios.
Our approach is able to find the most types of safety-critical scenarios, but only generating around 30% of the total scenarios compared with the baseline algorithm.
arXiv Detail & Related papers (2023-05-11T06:53:03Z) - Robust Semi-Supervised Anomaly Detection via Adversarially Learned
Continuous Noise Corruption [11.135527192198092]
Anomaly detection is the task of recognising novel samples which deviate significantly from pre-establishednormality.
Deep Autoencoders (AE) have been widely used foranomaly detection tasks, but suffer from overfitting to a null identity function.
We introduce an efficient methodof producing Adversarially Learned Continuous Noise (ALCN) to maximally globally corrupt the input prior to denoising.
arXiv Detail & Related papers (2023-03-02T22:59:20Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Tracking the risk of a deployed model and detecting harmful distribution
shifts [105.27463615756733]
In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially.
We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate.
arXiv Detail & Related papers (2021-10-12T17:21:41Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial of perturbation the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z) - Detecting Rewards Deterioration in Episodic Reinforcement Learning [63.49923393311052]
In many RL applications, once training ends, it is vital to detect any deterioration in the agent performance as soon as possible.
We consider an episodic framework, where the rewards within each episode are not independent, nor identically-distributed, nor Markov.
We define the mean-shift in a way corresponding to deterioration of a temporal signal (such as the rewards), and derive a test for this problem with optimal statistical power.
arXiv Detail & Related papers (2020-10-22T12:45:55Z) - Sequential Anomaly Detection using Inverse Reinforcement Learning [23.554584457413483]
We propose an end-to-end framework for sequential anomaly detection using inverse reinforcement learning (IRL)
We use a neural network to represent a reward function. Using a learned reward function, we evaluate whether a new observation from the target agent follows a normal pattern.
The empirical study on publicly available real-world data shows that our proposed method is effective in identifying anomalies.
arXiv Detail & Related papers (2020-04-22T05:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.