Statistically Significant $k$NNAD by Selective Inference
- URL: http://arxiv.org/abs/2502.12978v1
- Date: Tue, 18 Feb 2025 15:58:58 GMT
- Title: Statistically Significant $k$NNAD by Selective Inference
- Authors: Mizuki Niihori, Teruyuki Katsuoka, Tomohiro Shiraishi, Shuichi Nishino, Ichiro Takeuchi
- Abstract summary: A critical challenge in anomaly detection, including kNNAD, is appropriately quantifying the reliability of detected anomalies.
We formulate kNNAD as a statistical hypothesis test and quantify the probability of false detection using $p$-values.
By leveraging Selective Inference (SI), the Stat-kNNAD method ensures that detected anomalies are statistically significant with theoretical guarantees.
- Score: 12.703556860454565
- Abstract: In this paper, we investigate the problem of unsupervised anomaly detection using the k-Nearest Neighbor method. The k-Nearest Neighbor Anomaly Detection (kNNAD) is a simple yet effective approach for identifying anomalies across various domains and fields. A critical challenge in anomaly detection, including kNNAD, is appropriately quantifying the reliability of detected anomalies. To address this, we formulate kNNAD as a statistical hypothesis test and quantify the probability of false detection using $p$-values. The main technical challenge lies in performing both anomaly detection and statistical testing on the same data, which hinders correct $p$-value calculation within the conventional statistical testing framework. To resolve this issue, we introduce a statistical hypothesis testing framework called Selective Inference (SI) and propose a method named Statistically Significant NNAD (Stat-kNNAD). By leveraging SI, the Stat-kNNAD method ensures that detected anomalies are statistically significant with theoretical guarantees. The proposed Stat-kNNAD method is applicable to anomaly detection in both the original feature space and latent feature spaces derived from deep learning models. Through numerical experiments on synthetic data and applications to industrial product anomaly detection, we demonstrate the validity and effectiveness of the Stat-kNNAD method.
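The abstract combines two ingredients: a k-nearest-neighbor anomaly score and a selective p-value computed after detection. The sketch below is a minimal illustration of both ideas, not the authors' Stat-kNNAD implementation: it scores a test point by its k-th nearest-neighbor distance and then, assuming a Gaussian null and a hypothetical truncation interval standing in for the selection event derived in the paper, evaluates a selective p-value from the corresponding truncated normal. All function names, the null parameters, and the truncation interval are illustrative assumptions.
```python
# Minimal sketch of kNN anomaly scoring and a truncated-normal selective p-value.
# This illustrates the general ideas only, not the Stat-kNNAD algorithm itself.
import numpy as np
from scipy.stats import norm

def knn_score(x, reference, k=5):
    """Anomaly score: distance from x to its k-th nearest neighbor in `reference`."""
    dists = np.linalg.norm(reference - x, axis=1)
    return np.sort(dists)[k - 1]

def naive_p_value(score, null_mean, null_std):
    """Naive one-sided p-value under a Gaussian null (invalid after selection)."""
    return norm.sf(score, loc=null_mean, scale=null_std)

def selective_p_value(score, null_mean, null_std, lower, upper):
    """One-sided p-value conditioned on the score falling in [lower, upper].

    Selective inference replaces the naive p-value with the tail probability of
    the null distribution truncated to the selection event; the interval
    [lower, upper] here is a stand-in for the event characterized in the paper.
    """
    z = (score - null_mean) / null_std
    a, b = (lower - null_mean) / null_std, (upper - null_mean) / null_std
    denom = norm.cdf(b) - norm.cdf(a)
    return (norm.cdf(b) - norm.cdf(z)) / denom

rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 2))   # unlabeled "normal" data
x_test = np.array([3.0, 3.0])           # candidate anomaly
s = knn_score(x_test, reference, k=5)
# Hypothetical null parameters and truncation interval, for illustration only.
print(naive_p_value(s, null_mean=0.5, null_std=0.2))
print(selective_p_value(s, null_mean=0.5, null_std=0.2, lower=1.0, upper=np.inf))
```
The naive p-value ignores the fact that the same data triggered the detection; correcting for that double use is exactly the problem the paper's selective inference framework addresses.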
Related papers
- Unsupervised Anomaly Detection Using Diffusion Trend Analysis [48.19821513256158]
We propose a method to detect anomalies by analyzing the reconstruction trend as a function of the degree of degradation.
The proposed method is validated on an open dataset for industrial anomaly detection.
arXiv Detail & Related papers (2024-07-12T01:50:07Z)
- Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors [0.0]
In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for anomaly detection.
We demonstrate that the derived methods for calculating resampling-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal), as they make more efficient use of the available data (see the sketch after this entry).
arXiv Detail & Related papers (2024-02-26T08:22:40Z)
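As a point of reference for the entry above, the sketch below shows the standard split-conformal p-value that the leave-one-out, bootstrap, and cross-conformal variants aim to improve on by reusing data more efficiently. The 1-NN scorer and the synthetic data are placeholders; any anomaly score can be plugged in.
```python
# Minimal split-conformal p-value sketch (the baseline the entry above compares against).
# The scorer and data are placeholders; any anomaly score function can be substituted.
import numpy as np

def split_conformal_p_value(test_score, calibration_scores):
    """p = (1 + #{calibration scores >= test score}) / (n_cal + 1).

    Under exchangeability of normal data this p-value is super-uniform, so
    thresholding it at alpha controls the false alarm rate for a test point.
    """
    n_cal = len(calibration_scores)
    return (1 + np.sum(calibration_scores >= test_score)) / (n_cal + 1)

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))   # used to fit the scorer
calib = rng.normal(size=(100, 2))   # held-out calibration split
score = lambda x: np.min(np.linalg.norm(train - x, axis=1))  # 1-NN distance as a stand-in scorer

cal_scores = np.array([score(x) for x in calib])
print(split_conformal_p_value(score(np.array([3.0, 3.0])), cal_scores))
```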
- Statistical Test on Diffusion Model-based Anomaly Detection by Selective Inference [19.927066428010782]
We address the task of detecting anomalous regions in medical images using diffusion models.
We propose a statistical method to quantify the reliability of the detected anomalies.
arXiv Detail & Related papers (2024-02-19T02:32:45Z)
- Statistical Test for Anomaly Detections by Variational Auto-Encoders [19.927066428010782]
We consider the reliability assessment of anomaly detection using a Variational Autoencoder (VAE).
Using the VAE-AD Test, the reliability of the anomaly regions detected by a VAE can be quantified in the form of p-values.
arXiv Detail & Related papers (2024-02-06T05:42:27Z)
- Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048]
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning.
However, theoretical aspects such as identifiability and the properties of statistical estimation remain obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
arXiv Detail & Related papers (2022-10-12T06:46:38Z)
- Null Hypothesis Test for Anomaly Detection [0.0]
We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis.
By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions (see the sketch after this entry).
arXiv Detail & Related papers (2022-10-05T13:03:55Z)
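The entry above hinges on testing whether the two dataset regions are statistically independent of the classifier's output. The sketch below is only a generic illustration of such an independence test via a contingency table and a chi-squared statistic; it is not the paper's specific construction, and the regions, scores, and binning are invented for the example.
```python
# Generic independence-test sketch: is region membership independent of the
# (binned) classifier score? Not the paper's specific test construction.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
region = rng.integers(0, 2, size=1000)            # two dataset regions (placeholder labels)
scores = rng.normal(size=1000)                    # classifier outputs (placeholder)
high = (scores > np.median(scores)).astype(int)   # bin scores into low/high

# 2x2 contingency table of region vs. binned score.
table = np.zeros((2, 2), dtype=int)
for r, h in zip(region, high):
    table[r, h] += 1

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.3f}, p={p:.3f}")  # large p: no evidence against independence (background-only)
```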
- Catching Both Gray and Black Swans: Open-set Supervised Anomaly Detection [90.32910087103744]
A few labeled anomaly examples are often available in many real-world applications.
These anomaly examples provide valuable knowledge about the application-specific abnormality.
However, the anomalies seen during training often do not cover every possible class of anomaly.
This paper tackles open-set supervised anomaly detection.
arXiv Detail & Related papers (2022-03-28T05:21:37Z)
- Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability (see the sketch after this entry).
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
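The entry above describes learning discriminative normality from a few labeled anomalies and a prior probability. The sketch below illustrates the deviation-loss idea commonly associated with deviation networks, in plain NumPy: unlabeled (presumed normal) scores are pulled toward the mean of reference scores drawn from a standard normal prior, while labeled anomalies are pushed at least a margin of standard deviations above it. The scores, labels, and margin are placeholders, and this is not the authors' training code.
```python
# Deviation-loss sketch (NumPy only): reference scores drawn from a N(0, 1) prior
# define what a "normal" score looks like; labeled anomalies must deviate by a margin.
# Illustrative only; not the authors' training code.
import numpy as np

def deviation_loss(scores, labels, margin=5.0, n_ref=5000, rng=None):
    """labels: 0 = unlabeled/normal, 1 = labeled anomaly; scores: model outputs."""
    rng = rng or np.random.default_rng(0)
    ref = rng.standard_normal(n_ref)                # prior over normal scores
    dev = (scores - ref.mean()) / ref.std()         # standardized deviation of each score
    loss_normal = (1 - labels) * np.abs(dev)        # pull normal scores toward the prior mean
    loss_anomaly = labels * np.maximum(0.0, margin - dev)  # push anomalies above the margin
    return np.mean(loss_normal + loss_anomaly)

scores = np.array([0.1, -0.2, 4.8, 6.0])            # hypothetical model outputs
labels = np.array([0, 0, 1, 1])                     # two labeled anomalies
print(deviation_loss(scores, labels))
```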
- Understanding the Effect of Bias in Deep Anomaly Detection [15.83398707988473]
Anomaly detection presents a unique challenge in machine learning due to the scarcity of labeled anomaly data.
Recent work attempts to mitigate such problems by augmenting training of deep anomaly detection models with additional labeled anomaly samples.
In this paper, we aim to understand the effect of a biased anomaly set on anomaly detection.
arXiv Detail & Related papers (2021-05-16T03:55:02Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
- Density of States Estimation for Out-of-Distribution Detection [69.90130863160384]
DoSE, the density-of-states estimator, is an unsupervised out-of-distribution (OOD) detector.
We demonstrate DoSE's state-of-the-art performance against other unsupervised OOD detectors.
arXiv Detail & Related papers (2020-06-16T16:06:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.