OIL-AD: An Anomaly Detection Framework for Sequential Decision Sequences
- URL: http://arxiv.org/abs/2402.04567v1
- Date: Wed, 7 Feb 2024 04:06:53 GMT
- Title: OIL-AD: An Anomaly Detection Framework for Sequential Decision Sequences
- Authors: Chen Wang, Sarah Erfani, Tansu Alpcan, Christopher Leckie
- Abstract summary: We propose an unsupervised method named Offline Learning based Anomaly Detection (OIL-AD)
OIL-AD detects anomalies in decision-making sequences using two extracted behaviour features: action optimality and sequential association.
Our experiments show that OIL-AD can achieve outstanding online anomaly detection performance with up to 34.8% improvement in F1 score over comparable baselines.
- Score: 16.828732283348817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly detection in decision-making sequences is a challenging problem due
to the complexity of normality representation learning and the sequential
nature of the task. Most existing methods based on Reinforcement Learning (RL)
are difficult to implement in the real world due to unrealistic assumptions,
such as having access to environment dynamics, reward signals, and online
interactions with the environment. To address these limitations, we propose an
unsupervised method named Offline Imitation Learning based Anomaly Detection
(OIL-AD), which detects anomalies in decision-making sequences using two
extracted behaviour features: action optimality and sequential association. Our
offline learning model is an adaptation of behavioural cloning with a
transformer policy network, where we modify the training process to learn a Q
function and a state value function from normal trajectories. We propose that
the Q function and the state value function can provide sufficient information
about agents' behavioural data, from which we derive two features for anomaly
detection. The intuition behind our method is that the action optimality
feature derived from the Q function can differentiate the optimal action from
others at each local state, and the sequential association feature derived from
the state value function has the potential to maintain the temporal
correlations between decisions (state-action pairs). Our experiments show that
OIL-AD can achieve outstanding online anomaly detection performance with up to
34.8% improvement in F1 score over comparable baselines.
Related papers
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z) - Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments [13.163784646113214]
Continual Test-Time Adaptation (CTTA) has recently emerged as a promising technique to gradually adapt a source-trained model to continually changing target domains.
We present CTAOD, featuring three core components. Firstly, the object-level contrastive learning module extracts object-level features for contrastive learning to refine the feature representation in the target domain.
Secondly, the adaptive monitoring module dynamically skips unnecessary adaptation and updates the category-specific threshold based on predicted confidence scores to enable efficiency and improve the quality of pseudo-labels.
arXiv Detail & Related papers (2024-06-24T08:30:03Z) - Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z) - Learning Generalizable Agents via Saliency-Guided Features Decorrelation [25.19044461705711]
We propose Saliency-Guided Features Decorrelation to eliminate correlations between features and decisions.
RFF is utilized to estimate the complex non-linear correlations in high-dimensional images, while the saliency map is designed to identify the changed features.
Under the guidance of the saliency map, SGFD employs sample reweighting to minimize the estimated correlations related to changed features.
arXiv Detail & Related papers (2023-10-08T09:24:43Z) - Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z) - Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous
Unobserved Confounders [16.193776814471768]
We study robust policy evaluation and policy optimization in the presence of sequentially-exogenous unobserved confounders.
We provide sample complexity bounds, insights, and show effectiveness both in simulations and on real-world longitudinal healthcare data of treating sepsis.
arXiv Detail & Related papers (2023-02-01T18:40:53Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Gleo-Det: Deep Convolution Feature-Guided Detector with Local Entropy
Optimization for Salient Points [5.955667705173262]
We propose to achieve fine constraint based on the requirement of repeatability while coarse constraint with guidance of deep convolution features.
With the guidance of convolution features, we define the cost function from both positive and negative sides.
arXiv Detail & Related papers (2022-04-27T12:40:21Z) - Data Augmentation through Expert-guided Symmetry Detection to Improve
Performance in Offline Reinforcement Learning [0.0]
offline estimation of the dynamical model of a Markov Decision Process (MDP) is a non-trivial task.
Recent works showed that an expert-guided pipeline relying on Density Estimation methods effectively detects this structure in deterministic environments.
We show that the former results lead to a performance improvement when solving the learned MDP and then applying the optimized policy in the real environment.
arXiv Detail & Related papers (2021-12-18T14:32:32Z) - Offline Reinforcement Learning with Implicit Q-Learning [85.62618088890787]
Current offline reinforcement learning methods need to query the value of unseen actions during training to improve the policy.
We propose an offline RL method that never needs to evaluate actions outside of the dataset.
This method enables the learned policy to improve substantially over the best behavior in the data through generalization.
arXiv Detail & Related papers (2021-10-12T17:05:05Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.