A Non-Parametric Subspace Analysis Approach with Application to Anomaly
Detection Ensembles
- URL: http://arxiv.org/abs/2101.04932v1
- Date: Wed, 13 Jan 2021 08:23:01 GMT
- Authors: Marcelo Bacher, Irad Ben-Gal, Erez Shmueli
- Abstract summary: We propose a new subspace analysis approach named Agglomerative Attribute Grouping (AAG).
AAG relies on a novel multi-attribute measure, which is derived from information theory measures of partitions.
In the vast majority of cases, the proposed AAG method outperforms classical and state-of-the-art subspace analysis methods.
- Score: 8.533981186119068
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying anomalies in multi-dimensional datasets is an important task in
many real-world applications. A special case arises when anomalies are occluded
in a small set of attributes, typically referred to as a subspace, and not
necessarily over the entire data space. In this paper, we propose a new
subspace analysis approach named Agglomerative Attribute Grouping (AAG) that
aims to address this challenge by searching for subspaces that are comprised of
highly correlative attributes. Such correlations among attributes represent a
systematic interaction among the attributes that can better reflect the
behavior of normal observations and hence can be used to improve the
identification of two particularly interesting types of abnormal data samples:
anomalies that are occluded in relatively small subsets of the attributes and
anomalies that represent a new data class. AAG relies on a novel
multi-attribute measure, which is derived from information theory measures of
partitions, for evaluating the "information distance" between groups of data
attributes. To determine the set of subspaces to use, AAG applies a variation
of the well-known agglomerative clustering algorithm with the proposed
multi-attribute measure as the underlying distance function. Finally, the set
of subspaces is used in an ensemble for anomaly detection. Extensive evaluation
demonstrates that, in the vast majority of cases, the proposed AAG method (i)
outperforms classical and state-of-the-art subspace analysis methods when used
in anomaly detection ensembles, and (ii) generates fewer subspaces, each with
fewer attributes (on average), thus resulting in a faster training
time for the anomaly detection ensemble. Furthermore, in contrast to existing
methods, the proposed AAG method does not require any tuning of parameters.
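
The pipeline described in the abstract (measure an information distance between attribute groups, merge groups agglomeratively, then score anomalies with one detector per subspace) can be illustrated with a rough sketch. This is not the authors' implementation. The abstract does not define the AAG measure, so the code substitutes the normalized variation of information as a stand-in distance. It also assumes the attributes are already discretized, and it uses a hypothetical `max_distance` stopping threshold even though the paper states that AAG needs no parameter tuning. The function names and the frequency-based scorer are likewise assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def entropy(labels):
    """Shannon entropy (in nats) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())


def joint_labels(X_disc, cols):
    """Collapse the selected discrete columns into one label per row."""
    _, inverse = np.unique(X_disc[:, cols], axis=0, return_inverse=True)
    return inverse.ravel()


def normalized_vi(X_disc, group_a, group_b):
    """Stand-in distance (NOT the paper's measure): 1 - I(A;B) / H(A,B)."""
    h_a = entropy(joint_labels(X_disc, group_a))
    h_b = entropy(joint_labels(X_disc, group_b))
    h_ab = entropy(joint_labels(X_disc, group_a + group_b))
    if h_ab == 0.0:
        return 0.0  # both groups are constant, treat them as identical
    return (2.0 * h_ab - h_a - h_b) / h_ab


def group_attributes(X_disc, max_distance=0.5):
    """Agglomeratively merge attribute groups, closest pair first.

    max_distance is a hypothetical stopping threshold used only in this sketch.
    """
    groups = [[j] for j in range(X_disc.shape[1])]
    while len(groups) > 1:
        (i, j), dist = min(
            (((i, j), normalized_vi(X_disc, groups[i], groups[j]))
             for i, j in combinations(range(len(groups)), 2)),
            key=lambda item: item[1],
        )
        if dist > max_distance:
            break
        groups[i] = groups[i] + groups[j]
        del groups[j]  # safe because i < j
    return groups


def ensemble_scores(X_train, X_test, groups):
    """Average, over subspaces, of -log(smoothed training frequency)."""
    scores = np.zeros(len(X_test))
    n = len(X_train)
    for g in groups:
        counts = {}
        for row in X_train[:, g]:
            key = tuple(row)
            counts[key] = counts.get(key, 0) + 1
        scores += np.array(
            [-np.log((counts.get(tuple(row), 0) + 1) / (n + 1))
             for row in X_test[:, g]]
        )
    return scores / len(groups)
```

With training data in which two attributes always move together and a third is independent, `group_attributes` should place the dependent pair in one subspace, and a test row that breaks the dependency should receive a higher score from `ensemble_scores` than a row that respects it.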
Related papers
- ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly.
Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z) - Domain Adaptive and Fine-grained Anomaly Detection for Single-cell Sequencing Data and Beyond [4.4136780724044735]
We present ACSleuth, a novel, reconstruction deviation-guided generative framework that integrates the detection, domain adaptation, and fine-grained annotating of anomalous cells into a methodologically cohesive workflow.
This analysis informs us to develop a novel and superior maximum mean discrepancy-based anomaly scorer in ACSleuth.
arXiv Detail & Related papers (2024-04-26T14:48:24Z) - Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark [101.23684938489413]
Anomaly detection (AD) is often focused on detecting anomalies for industrial quality inspection and medical lesion examination.
This work first constructs a large-scale and general-purpose COCO-AD dataset by extending COCO to the AD field.
Inspired by the metrics in the segmentation field, we propose several more practical threshold-dependent AD-specific metrics.
arXiv Detail & Related papers (2024-04-16T17:38:26Z) - Multi-Class Anomaly Detection based on Regularized Discriminative
Coupled hypersphere-based Feature Adaptation [85.15324009378344]
This paper introduces a new model by including class discriminative properties obtained by a modified Regularized Discriminative Variational Auto-Encoder (RD-VAE) in the feature extraction process.
The proposed Regularized Discriminative Coupled-hypersphere-based Feature Adaptation (RD-CFA) forms a solution for multi-class anomaly detection.
arXiv Detail & Related papers (2023-11-24T14:26:07Z) - Towards Interpretable Anomaly Detection via Invariant Rule Mining [2.538209532048867]
In this work, we pursue highly interpretable anomaly detection via invariant rule mining.
Specifically, we leverage decision tree learning and association rule mining to automatically generate invariant rules.
The generated invariant rules can provide explicit explanation of anomaly detection results and thus are extremely useful for subsequent decision-making.
arXiv Detail & Related papers (2022-11-24T13:03:20Z) - Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z) - Data-Efficient and Interpretable Tabular Anomaly Detection [54.15249463477813]
We propose a novel framework that adapts a white-box model class, Generalized Additive Models, to detect anomalies.
In addition, the proposed framework, DIAD, can incorporate a small amount of labeled data to further boost anomaly detection performance in semi-supervised settings.
arXiv Detail & Related papers (2022-03-03T22:02:56Z) - Dependency-based Anomaly Detection: a General Framework and Comprehensive Evaluation [33.31923133201812]
This paper introduces Dependency-based Anomaly Detection (DepAD).
DepAD reframes unsupervised anomaly detection as supervised feature selection and prediction tasks.
Two DepAD algorithms emerge as all-rounders and superior performers in handling a wide range of datasets.
arXiv Detail & Related papers (2020-11-13T01:39:44Z) - Toward Deep Supervised Anomaly Detection: Reinforcement Learning from
Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset.
Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data.
We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.