On the Nature and Types of Anomalies: A Review of Deviations in Data
- URL: http://arxiv.org/abs/2007.15634v5
- Date: Mon, 29 May 2023 08:21:47 GMT
- Title: On the Nature and Types of Anomalies: A Review of Deviations in Data
- Authors: Ralph Foorthuis
- Abstract summary: This study offers the first theoretically principled and domain-independent typology of data anomalies.
To concretely define the concept of the anomaly and its different manifestations, the typology employs five dimensions.
These fundamental and data-centric dimensions naturally yield 3 broad groups, 9 basic types, and 63 subtypes of anomalies.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomalies are occurrences in a dataset that are in some way unusual and do
not fit the general patterns. The concept of the anomaly is typically
ill-defined and perceived as vague and domain-dependent. Moreover, despite some
250 years of publications on the topic, no comprehensive and concrete overviews
of the different types of anomalies have hitherto been published. By means of
an extensive literature review this study therefore offers the first
theoretically principled and domain-independent typology of data anomalies and
presents a full overview of anomaly types and subtypes. To concretely define
the concept of the anomaly and its different manifestations, the typology
employs five dimensions: data type, cardinality of relationship, anomaly level,
data structure, and data distribution. These fundamental and data-centric
dimensions naturally yield 3 broad groups, 9 basic types, and 63 subtypes of
anomalies. The typology facilitates the evaluation of the functional
capabilities of anomaly detection algorithms, contributes to explainable data
science, and provides insights into relevant topics such as local versus global
anomalies.
Related papers
- Prototypical Residual Networks for Anomaly Detection and Localization [80.5730594002466]
We propose a framework called Prototypical Residual Network (PRN).
PRN learns feature residuals of varying scales and sizes between anomalous and normal patterns to accurately reconstruct the segmentation maps of anomalous regions.
We present a variety of anomaly generation strategies that consider both seen and unseen appearance variance to enlarge and diversify anomalies.
arXiv Detail & Related papers (2022-12-05T05:03:46Z)
- Deep Learning for Time Series Anomaly Detection: A Survey [53.83593870825628]
Time series anomaly detection has applications in a wide range of research fields, including manufacturing and healthcare.
The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns.
This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection.
arXiv Detail & Related papers (2022-11-09T22:40:22Z)
- Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z)
- Catching Both Gray and Black Swans: Open-set Supervised Anomaly Detection [90.32910087103744]
A few labeled anomaly examples are often available in many real-world applications.
These anomaly examples provide valuable knowledge about the application-specific abnormality.
Those anomalies seen during training often do not illustrate every possible class of anomaly.
This paper tackles open-set supervised anomaly detection.
arXiv Detail & Related papers (2022-03-28T05:21:37Z)
- A Taxonomy of Anomalies in Log Data [0.09558392439655014]
A common taxonomy for anomalies already exists, but it has not yet been applied specifically to log data.
We present a taxonomy for different kinds of log data anomalies and introduce a method for analyzing such anomalies in labeled datasets.
Our results show that the most common anomaly type is also the easiest to predict.
arXiv Detail & Related papers (2021-11-26T12:23:06Z)
- Variation and generality in encoding of syntactic anomaly information in sentence embeddings [7.132368785057315]
We explore fine-grained differences in anomaly encoding by designing probing tasks that vary the hierarchical level at which anomalies occur in a sentence.
We test not only models' ability to detect a given anomaly, but also the generality of the detected anomaly signal.
Results suggest that all models encode some information supporting anomaly detection, but detection performance varies between anomalies.
arXiv Detail & Related papers (2021-11-12T10:23:43Z)
- A Typology of Data Anomalies [0.0]
Anomalies are cases that are in some way unusual and do not appear to fit the general patterns present in the dataset.
This paper introduces a general typology of anomalies that offers a clear and tangible definition of the different types of anomalies in datasets.
arXiv Detail & Related papers (2021-07-04T13:12:24Z)
- Anomaly detection using principles of human perception [0.0]
An unsupervised anomaly detection algorithm is developed that is simple, real-time, and parameter-free.
The idea is to assume anomalies are observations that are unexpected to occur with respect to certain groupings made by the majority of the data.
arXiv Detail & Related papers (2021-03-23T05:46:27Z)
- Bias in ontologies -- a preliminary assessment [2.360534864805446]
Algorithmic bias is a well-known notion, but what does bias mean in the context of ontologies that provide a mechanism for an algorithm's input?
This characterisation aims to contribute to a sensitisation of ethical aspects of the representation of information and knowledge.
arXiv Detail & Related papers (2021-01-20T09:28:08Z)
- Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset.
Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data.
We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z)
- Deep Weakly-supervised Anomaly Detection [118.55172352231381]
Pairwise Relation prediction Network (PReNet) learns pairwise relation features and anomaly scores.
PReNet can detect any seen/unseen abnormalities that fit the learned pairwise abnormal patterns.
Empirical results on 12 real-world datasets show that PReNet significantly outperforms nine competing methods in detecting seen and unseen anomalies.
arXiv Detail & Related papers (2019-10-30T00:40:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.