Clustered Hierarchical Anomaly and Outlier Detection Algorithms
- URL: http://arxiv.org/abs/2103.11774v1
- Date: Tue, 9 Feb 2021 15:27:52 GMT
- Title: Clustered Hierarchical Anomaly and Outlier Detection Algorithms
- Authors: Najib Ishaq, Thomas J. Howard III, Noah M. Daniels
- Abstract summary: We present CLAM, a fast hierarchical clustering technique that learns a manifold in a Banach space defined by a distance metric.
On 24 publicly available datasets, we compare the performance of CHAODA to a variety of state-of-the-art unsupervised anomaly-detection algorithms.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly and outlier detection in datasets is a long-standing problem in
machine learning. In some cases, anomaly detection is easy, such as when data
are drawn from well-characterized distributions such as the Gaussian. However,
when data occupy high-dimensional spaces, anomaly detection becomes more
difficult. We present CLAM (Clustered Learning of Approximate Manifolds), a
fast hierarchical clustering technique that learns a manifold in a Banach space
defined by a distance metric. CLAM induces a graph from the cluster tree, based
on overlapping clusters determined by several geometric and topological
features. On these graphs, we implement CHAODA (Clustered Hierarchical Anomaly
and Outlier Detection Algorithms), exploring various properties of the graphs
and their constituent clusters to compute scores of anomalousness. On 24
publicly available datasets, we compare the performance of CHAODA (by measure
of ROC AUC) to a variety of state-of-the-art unsupervised anomaly-detection
algorithms. Six of the datasets are used for training. CHAODA outperforms other
approaches on 14 of the remaining 18 datasets.
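The abstract compares detectors by ROC AUC. As a rough illustration of that metric only (not the authors' evaluation code), the sketch below computes ROC AUC from anomaly scores as the fraction of (anomaly, inlier) pairs in which the anomaly receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions made up for this example; they do not come from the paper.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC: chance that a random anomaly outscores a random inlier.

    labels: 1 marks an anomaly, 0 marks an inlier (hypothetical toy convention).
    scores: higher means "more anomalous".
    """
    anomalies = [s for y, s in zip(labels, scores) if y == 1]
    inliers = [s for y, s in zip(labels, scores) if y == 0]
    if not anomalies or not inliers:
        raise ValueError("need at least one anomaly and one inlier")

    wins = 0.0
    for a in anomalies:
        for i in inliers:
            if a > i:
                wins += 1.0   # anomaly correctly ranked above inlier
            elif a == i:
                wins += 0.5   # ties count as half a win
    return wins / (len(anomalies) * len(inliers))


# Toy data, invented for illustration: two anomalies among six points.
labels = [0, 0, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.8, 0.9, 0.3, 0.7]
auc = roc_auc(labels, scores)
print(auc)  # 0.875
```

A score of 1.0 means every anomaly is ranked above every inlier, and 0.5 corresponds to random ranking. The pairwise loop is quadratic and is meant only for small examples.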
Related papers
- ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly.
Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z)
- DeepHYDRA: Resource-Efficient Time-Series Anomaly Detection in Dynamically-Configured Systems [3.44012349879073]
We present DeepHYDRA (Deep Hybrid DBSCAN/Reduction-Based Anomaly Detection).
It combines DBSCAN and learning-based anomaly detection.
It is shown to reliably detect different types of anomalies in both large and complex datasets.
arXiv Detail & Related papers (2024-05-13T13:47:15Z)
- Multi-Class Deep SVDD: Anomaly Detection Approach in Astronomy with Distinct Inlier Categories [46.34797489552547]
We propose Multi-Class Deep Support Vector Data Description (MCDSVDD) to handle different inlier categories with distinct data distributions.
MCDSVDD uses a neural network to map the data into hyperspheres, where each hypersphere represents a specific inlier category.
Our results demonstrate the efficacy of MCDSVDD in detecting anomalous sources while leveraging the presence of different inlier categories.
arXiv Detail & Related papers (2023-08-09T15:10:53Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To better exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Unsupervised anomaly detection algorithms on real-world data: how many do we need? [1.4610038284393165]
This study is the largest comparison of unsupervised anomaly detection algorithms to date.
On the local datasets the $k$NN ($k$-nearest neighbor) algorithm comes out on top.
On the global datasets the EIF (extended isolation forest) algorithm performs the best.
arXiv Detail & Related papers (2023-05-01T09:27:42Z)
- ARISE: Graph Anomaly Detection on Attributed Networks via Substructure Awareness [70.60721571429784]
We propose a new graph anomaly detection framework on attributed networks via substructure awareness (ARISE).
ARISE focuses on the substructures in the graph to discern abnormalities.
Experiments show that ARISE greatly improves detection performance compared to state-of-the-art attributed networks anomaly detection (ANAD) algorithms.
arXiv Detail & Related papers (2022-11-28T12:17:40Z)
- Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection [57.85347204640585]
We develop a Universal Domain Adaptation method, DeepAstroUDA.
It can be applied to datasets with different types of class overlap.
For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets.
arXiv Detail & Related papers (2022-11-01T18:07:21Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)