Benchmarking Unsupervised Outlier Detection with Realistic Synthetic Data
- URL: http://arxiv.org/abs/2004.06947v1
- Date: Wed, 15 Apr 2020 08:55:47 GMT
- Title: Benchmarking Unsupervised Outlier Detection with Realistic Synthetic Data
- Authors: Georg Steinbuss and Klemens Böhm
- Abstract summary: Benchmarking unsupervised outlier detection is difficult.
We propose a generic process for the generation of data sets for such benchmarking.
We describe three instantiations of the generic process that generate outliers with specific characteristics.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Benchmarking unsupervised outlier detection is difficult. Outliers are rare,
and existing benchmark data contains outliers with various and unknown
characteristics. Fully synthetic data usually consists of outliers and regular
instances with clear characteristics and thus allows for a more meaningful
evaluation of detection methods in principle. Nonetheless, there have been only
a few attempts to include synthetic data in benchmarks for outlier detection.
This might be due to the imprecise notion of outliers or to the difficulty of
arriving at a good coverage of different domains with synthetic data. In this
work we propose a generic process for the generation of data sets for such
benchmarking. The core idea is to reconstruct regular instances from existing
real-world benchmark data while generating outliers so that they exhibit
insightful characteristics. This allows both for a good coverage of domains and
for helpful interpretations of results. We also describe three instantiations
of the generic process that generate outliers with specific characteristics,
like local outliers. A benchmark with state-of-the-art detection methods
confirms that our generic process is indeed practical.
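
The core idea in the abstract (reconstruct regular instances from real data, then generate outliers with a known characteristic) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the helper name `generate_benchmark`, the use of a single multivariate Gaussian as the model of regular instances, and the choice to draw outliers uniformly from an enlarged bounding box and keep only those beyond a Mahalanobis-distance threshold.

```python
import numpy as np


def generate_benchmark(real_data, n_regular, n_outliers, threshold=3.0, seed=0):
    """Hypothetical helper: build a labeled synthetic data set from real data.

    Assumptions (not from the paper): regular instances follow one
    multivariate Gaussian fitted to real_data, and outliers are uniform
    samples that lie more than `threshold` Mahalanobis units from its mean.
    """
    rng = np.random.default_rng(seed)
    n_features = real_data.shape[1]
    mean = real_data.mean(axis=0)
    cov = np.atleast_2d(np.cov(real_data, rowvar=False))
    cov_inv = np.linalg.inv(cov)

    # Step 1: reconstruct regular instances by sampling from the fitted model.
    regular = rng.multivariate_normal(mean, cov, size=n_regular)

    # Step 2: generate outliers. Candidates come from the bounding box of the
    # real data, enlarged by half its range on each side.
    low, high = real_data.min(axis=0), real_data.max(axis=0)
    margin = 0.5 * (high - low)
    outliers = []
    while len(outliers) < n_outliers:
        cand = rng.uniform(low - margin, high + margin,
                           size=(4 * n_outliers, n_features))
        diff = cand - mean
        dist = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
        # Keep only candidates that are far from the regular instances.
        outliers.extend(cand[dist > threshold])
    outliers = np.array(outliers[:n_outliers]).reshape(-1, n_features)

    # Label 0 marks a regular instance, label 1 marks a generated outlier.
    X = np.vstack([regular, outliers])
    y = np.concatenate([np.zeros(n_regular, dtype=int),
                        np.ones(n_outliers, dtype=int)])
    return X, y


# Usage with stand-in "real" data (random numbers, not an actual benchmark).
real = np.random.default_rng(1).normal(size=(500, 2))
X, y = generate_benchmark(real, n_regular=200, n_outliers=10)
print(X.shape, int(y.sum()))
```

Because the labels and the rule that produced each outlier are known, a detector's scores on `X` can be interpreted against that rule. A different generator in step 2 (for example, one that places outliers close to low-density regions rather than far from the mean) would be needed to produce other outlier types such as local outliers.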
Related papers
- Regularized Contrastive Partial Multi-view Outlier Detection [76.77036536484114]
We propose a novel method named Regularized Contrastive Partial Multi-view Outlier Detection (RCPMOD).
In this framework, we utilize contrastive learning to learn view-consistent information and distinguish outliers by the degree of consistency.
Experimental results on four benchmark datasets demonstrate that our proposed approach could outperform state-of-the-art competitors.
arXiv Detail & Related papers (2024-08-02T14:34:27Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection [55.70982767084996]
A critical yet frequently overlooked challenge in the field of deepfake detection is the lack of a standardized, unified, comprehensive benchmark.
We present the first comprehensive benchmark for deepfake detection, called DeepfakeBench, which offers three key contributions.
DeepfakeBench contains 15 state-of-the-art detection methods, 9 datasets, a series of deepfake detection evaluation protocols and analysis tools, as well as comprehensive evaluations.
arXiv Detail & Related papers (2023-07-04T01:34:41Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection [8.249143014271887]
General-purpose object detectors like Faster R-CNN are prone to providing overconfident predictions for outlier objects.
We propose a novel outlier-aware object detection framework that distinguishes outliers from inlier objects.
Our approach significantly outperforms the state-of-the-art for outlier-aware object detection on both image and video datasets.
arXiv Detail & Related papers (2023-02-01T13:12:00Z)
- C-AllOut: Catching & Calling Outliers by Type [10.69970450827617]
C-AllOut is a novel outlier detector that annotates outliers by type.
It is parameter-free and scalable, and it can work with only pairwise similarities (or distances) when needed.
arXiv Detail & Related papers (2021-10-13T14:25:52Z)
- Unsupervised Outlier Detection using Memory and Contrastive Learning [53.77693158251706]
We think outlier detection can be done in the feature space by measuring the feature distance between outliers and inliers.
We propose a framework, MCOD, using a memory module and a contrastive learning module.
Our proposed MCOD achieves considerable performance and outperforms nine state-of-the-art methods.
arXiv Detail & Related papers (2021-07-27T07:35:42Z)
- Homophily Outlier Detection in Non-IID Categorical Data [43.51919113927003]
This work introduces a novel outlier detection framework and its two instances to identify outliers in categorical data.
It first defines and incorporates distribution-sensitive outlier factors and their interdependence into a value-value graph-based representation.
The learned value outlierness allows for either direct outlier detection or outlying feature selection.
arXiv Detail & Related papers (2021-03-21T23:29:33Z)
- Anomaly Detection based on Zero-Shot Outlier Synthesis and Hierarchical Feature Distillation [2.580765958706854]
Synthetically generated anomalies are a solution to such ill-defined or not fully defined data.
We propose a two-level hierarchical latent space representation that distills inliers' feature-descriptors.
We select those that lie on the outskirts of the training data as synthetic-outlier generators.
arXiv Detail & Related papers (2020-10-10T23:34:02Z)