Finding Pegasus: Enhancing Unsupervised Anomaly Detection in High-Dimensional Data using a Manifold-Based Approach
- URL: http://arxiv.org/abs/2502.04310v1
- Date: Thu, 06 Feb 2025 18:53:30 GMT
- Title: Finding Pegasus: Enhancing Unsupervised Anomaly Detection in High-Dimensional Data using a Manifold-Based Approach
- Authors: R. P. Nathan, Nikolaos Nikolaou, Ofer Lahav
- Abstract summary: We present an idealised illustration, "Finding Pegasus", and a novel formal framework with which we categorise unsupervised anomaly detection methods.
We then use this insight to develop an approach to combining AD methods that significantly boosts AD recall without sacrificing precision in settings with high DR.
- Abstract: Unsupervised machine learning methods are well suited to searching for anomalies at scale but can struggle with the high-dimensional representation of many modern datasets, hence dimensionality reduction (DR) is often performed first. In this paper we analyse unsupervised anomaly detection (AD) from the perspective of the manifold created in DR. We present an idealised illustration, "Finding Pegasus", and a novel formal framework with which we categorise AD methods and their results into "on manifold" and "off manifold". We define these terms and show how they differ. We then use this insight to develop an approach of combining AD methods which significantly boosts AD recall without sacrificing precision in situations employing high DR. When tested on MNIST data, our approach of combining AD methods improves recall by as much as 16 percent compared with simply combining with the best standalone AD method (Isolation Forest), a result which shows great promise for its application to real-world data.
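To make the abstract's recipe concrete, here is a minimal, hypothetical Python sketch of one way DR and AD methods could be combined along these lines: PCA stands in for the DR step, Isolation Forest (the best standalone method named above) acts as the "on manifold" detector in the reduced space, and PCA reconstruction error serves as the "off manifold" signal. The detectors, the union combination rule, and every parameter name here are illustrative assumptions, not the paper's exact method.

```python
# Hypothetical illustration only: combine an "on manifold" detector (run in
# the DR space) with an "off manifold" signal (distance from the manifold,
# approximated here by PCA reconstruction error).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest

def combined_anomaly_flags(X, n_components=16, contamination=0.01):
    # DR step: fit a linear manifold and project onto it.
    pca = PCA(n_components=n_components).fit(X)
    Z = pca.transform(X)

    # On manifold: a standard detector in the reduced space.
    iso = IsolationForest(contamination=contamination, random_state=0).fit(Z)
    on_manifold = iso.predict(Z) == -1  # -1 marks anomalies

    # Off manifold: points the DR step reconstructs poorly lie far from the
    # learned manifold; flag the worst `contamination` fraction by error.
    X_hat = pca.inverse_transform(Z)
    err = np.linalg.norm(X - X_hat, axis=1)
    off_manifold = err > np.quantile(err, 1.0 - contamination)

    # Union of the two flag sets: the intuition is that the two detectors
    # see different failure modes, which is how recall can rise without a
    # large precision cost.
    return on_manifold | off_manifold
```

Taking the union rather than, say, the intersection reflects the recall-oriented goal the abstract describes; the paper's actual combination rule may be more refined.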
Related papers
- One-to-Normal: Anomaly Personalization for Few-shot Anomaly Detection [15.782992908061736]
We introduce the anomaly personalization method, which performs a personalized one-to-normal transformation of query images.
We also propose a triplet contrastive anomaly inference strategy, which incorporates a comprehensive comparison between the query and generated anomaly-free data pool and prompt information.
Our method transfers flexibly to other AD methods, with the generated image data effectively improving their performance.
arXiv Detail & Related papers (2025-02-03T09:49:01Z)
- Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark [101.23684938489413]
Anomaly detection (AD) is often focused on detecting anomalies for industrial quality inspection and medical lesion examination.
This work first constructs a large-scale and general-purpose COCO-AD dataset by extending COCO to the AD field.
Inspired by the metrics in the segmentation field, we propose several more practical threshold-dependent AD-specific metrics.
arXiv Detail & Related papers (2024-04-16T17:38:26Z)
- Don't Miss Out on Novelty: Importance of Novel Features for Deep Anomaly Detection [64.21963650519312]
Anomaly Detection (AD) is a critical task that involves identifying observations that do not conform to a learned model of normality.
We propose a novel approach to AD that uses explainability to capture novel features as unexplained observations in the input space.
Our approach establishes a new state-of-the-art across multiple benchmarks, handling diverse anomaly types.
arXiv Detail & Related papers (2023-10-01T21:24:05Z)
- UADB: Unsupervised Anomaly Detection Booster [29.831918685340433]
Unsupervised Anomaly Detection (UAD) is a key data mining problem owing to its wide real-world applications.
No single assumption can describe the complexity of real-world data and be valid in all scenarios.
We propose a general UAD Booster (UADB) that empowers any UAD models with adaptability to different data.
arXiv Detail & Related papers (2023-06-03T04:16:31Z)
- On the Connection of Generative Models and Discriminative Models for Anomaly Detection [3.9072109732275084]
We propose a new perspective on the ideal performance of GM-based AD methods.
To bypass the implicit assumption in GM-based AD methods, we suggest integrating the discriminative idea to orient GMs to AD tasks.
arXiv Detail & Related papers (2022-11-16T13:42:01Z)
- Data-Efficient and Interpretable Tabular Anomaly Detection [54.15249463477813]
We propose a novel framework that adapts a white-box model class, Generalized Additive Models, to detect anomalies.
In addition, the proposed framework, DIAD, can incorporate a small amount of labeled data to further boost anomaly detection performance in semi-supervised settings.
arXiv Detail & Related papers (2022-03-03T22:02:56Z)
- Hard-label Manifolds: Unexpected Advantages of Query Efficiency for Finding On-manifold Adversarial Examples [67.23103682776049]
Recent zeroth order hard-label attacks on image classification models have shown comparable performance to their first-order, gradient-level alternatives.
It was recently shown in the gradient-level setting that regular adversarial examples leave the data manifold, while their on-manifold counterparts are in fact generalization errors.
We propose an information-theoretic argument based on a noisy manifold distance oracle, which leaks manifold information through the adversary's gradient estimate.
arXiv Detail & Related papers (2021-03-04T20:53:06Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
- Simple and Effective Prevention of Mode Collapse in Deep One-Class Classification [93.2334223970488]
We propose two regularizers to prevent hypersphere collapse in deep SVDD.
The first regularizer is based on injecting random noise via the standard cross-entropy loss.
The second regularizer penalizes the minibatch variance when it becomes too small; a hedged sketch of this idea follows this entry.
arXiv Detail & Related papers (2020-01-24T03:44:47Z)
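The variance regularizer in the last entry is concrete enough to sketch. Below is a hedged PyTorch reading of it: the standard deep SVDD pull-to-centre term plus a penalty that activates only when the per-dimension minibatch variance of the embeddings drops below a floor. The functional form, `var_floor`, and `lam` are assumptions for illustration, not taken from the paper.

```python
# Hedged sketch: deep SVDD objective with a minibatch-variance guard that
# discourages hypersphere collapse (all embeddings landing on the centre).
import torch
import torch.nn.functional as F

def svdd_loss_with_variance_guard(z, center, var_floor=1e-2, lam=1.0):
    """z: (batch, dim) embeddings; center: (dim,) fixed hypersphere centre."""
    # Standard deep SVDD term: pull embeddings toward the centre.
    svdd = ((z - center) ** 2).sum(dim=1).mean()

    # Variance guard: per-dimension minibatch variance, penalised only when
    # it falls below var_floor (ReLU leaves healthy batches unpenalised).
    var = z.var(dim=0, unbiased=False)
    guard = F.relu(var_floor - var).mean()

    return svdd + lam * guard
```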
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.