Generating Artificial Outliers in the Absence of Genuine Ones -- a
Survey
- URL: http://arxiv.org/abs/2006.03646v1
- Date: Fri, 5 Jun 2020 19:33:10 GMT
- Title: Generating Artificial Outliers in the Absence of Genuine Ones -- a
Survey
- Authors: Georg Steinbuss and Klemens Böhm
- Abstract summary: The literature features different approaches to generate artificial outliers.
We start by clarifying the terminology in the field, which varies from publication to publication.
We group the approaches by their general concepts and how they make use of genuine instances.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: By definition, outliers are rarely observed in reality, making them difficult
to detect or analyse. Artificial outliers approximate such genuine outliers and
can, for instance, help with the detection of genuine outliers or with
benchmarking outlier-detection algorithms. The literature features different
approaches to generate artificial outliers. However, a systematic comparison of
these approaches remains absent. This article surveys and compares these approaches. We
start by clarifying the terminology in the field, which varies from publication
to publication, and we propose a general problem formulation. Our description
of the connection of generating outliers to other research fields like
experimental design or generative models frames the field of artificial
outliers. Along with offering a concise description, we group the approaches by
their general concepts and how they make use of genuine instances. An extensive
experimental study reveals the differences between the generation approaches
when ultimately being used for outlier detection. This survey shows that the
existing approaches already cover a wide range of concepts underlying the
generation, but also that the field still has potential for further
development. Our experimental study does confirm the expectation that the
quality of the generation approaches varies widely, for example, in terms of
the data set they are used on. Ultimately, to guide the choice of the
generation approach in a specific context, we propose an appropriate
general decision process. In summary, this survey comprises, describes, and
connects all relevant work regarding the generation of artificial outliers and
may serve as a basis to guide further research in the field.
Related papers
- Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions [0.017476232824732776]
Time-series anomaly detection plays an important role in engineering processes.
This survey introduces a novel taxonomy where a distinction between online and offline, and training and inference is made.
It presents the most popular data sets and evaluation metrics used in the literature, as well as a detailed analysis.
arXiv Detail & Related papers (2024-08-07T13:01:10Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Demystifying amortized causal discovery with transformers [21.058343547918053]
Supervised learning approaches for causal discovery from observational data often achieve competitive performance.
In this work, we investigate CSIvA, a transformer-based model promising to train on synthetic data and transfer to real data.
We bridge the gap with existing identifiability theory and show that constraints on the training data distribution implicitly define a prior on the test observations.
arXiv Detail & Related papers (2024-05-27T08:17:49Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- Diversified Outlier Exposure for Out-of-Distribution Detection via Informative Extrapolation [110.34982764201689]
Out-of-distribution (OOD) detection is important for deploying reliable machine learning models on real-world applications.
Recent advances in outlier exposure have shown promising results on OOD detection by fine-tuning the model with informatively sampled auxiliary outliers.
We propose a novel framework, namely, Diversified Outlier Exposure (DivOE), for effective OOD detection via informative extrapolation based on the given auxiliary outliers.
arXiv Detail & Related papers (2023-10-21T07:16:09Z)
- OpenOOD: Benchmarking Generalized Out-of-Distribution Detection [60.13300701826931]
Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications.
The field currently lacks a unified, strictly formulated, and comprehensive benchmark.
We build a unified, well-structured codebase called OpenOOD, which implements over 30 methods developed in relevant fields.
arXiv Detail & Related papers (2022-10-13T17:59:57Z)
- Estimation Contracts for Outlier-Robust Geometric Perception [25.105820975269506]
Outlier-robust estimation is a fundamental problem and has been extensively investigated by statisticians and practitioners.
We provide conditions on the input under which modern estimation algorithms are guaranteed to recover an estimate close to the ground truth in the presence of outliers.
arXiv Detail & Related papers (2022-08-22T18:01:49Z)
- Assaying Out-Of-Distribution Generalization in Transfer Learning [103.57862972967273]
We take a unified view of previous work, highlighting message discrepancies that we address empirically.
We fine-tune over 31k networks, from nine different architectures in the many- and few-shot setting.
arXiv Detail & Related papers (2022-07-19T12:52:33Z)
- A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and Future Challenges [28.104112546546936]
Machine learning models often encounter samples that diverge from the training distribution.
Despite having similar and shared concepts, out-of-distribution, open-set, and anomaly detection have been investigated independently.
This survey aims to provide a cross-domain and comprehensive review of numerous eminent works in respective areas.
arXiv Detail & Related papers (2021-10-26T22:05:31Z)
- A Deep Variational Approach to Clustering Survival Data [5.871238645229228]
We introduce a novel probabilistic approach to cluster survival data in a variational deep clustering setting.
Our proposed method employs a deep generative model to uncover the underlying distribution of both the explanatory variables and the potentially censored survival times.
arXiv Detail & Related papers (2021-06-10T14:10:25Z)
- Anomalous Example Detection in Deep Learning: A Survey [98.2295889723002]
This survey tries to provide a structured and comprehensive overview of the research on anomaly detection for Deep Learning applications.
We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches.
We highlight the unsolved research challenges while applying anomaly detection techniques in DL systems and present some high-impact future research directions.
arXiv Detail & Related papers (2020-03-16T02:47:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.