Factor Analysis of Mixed Data for Anomaly Detection
- URL: http://arxiv.org/abs/2005.12129v1
- Date: Mon, 25 May 2020 14:13:10 GMT
- Title: Factor Analysis of Mixed Data for Anomaly Detection
- Authors: Matthew Davidow, David S. Matteson
- Abstract summary: Anomalous observations may correspond to financial fraud, health risks, or incorrectly measured data in practice.
We show that detecting anomalies in high-dimensional mixed data is enhanced by first embedding the data and then applying an anomaly scoring scheme.
- Score: 5.77019633619109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly detection aims to identify observations that deviate from the typical
pattern of data. Anomalous observations may correspond to financial fraud,
health risks, or incorrectly measured data in practice. We show that detecting
anomalies in high-dimensional mixed data is enhanced by first embedding
the data and then applying an anomaly scoring scheme. We focus on unsupervised
detection and the continuous and categorical (mixed) variable case. We propose
a kurtosis-weighted Factor Analysis of Mixed Data for anomaly detection,
FAMDAD, to obtain a continuous embedding for anomaly scoring. We illustrate
that anomalies are highly separable in the first and last few ordered
dimensions of this space, and evaluate various anomaly scoring schemes within
this subspace. Results are illustrated for both simulated and real datasets,
and the proposed approach (FAMDAD) is highly accurate for high-dimensional
mixed data throughout these diverse scenarios.
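The embed-then-score idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' FAMDAD implementation: the embedding is a plain FAMD (standardized continuous columns plus one-hot columns scaled by the square root of each level's proportion, followed by an SVD), and the kurtosis weighting shown here is only an assumed, simple form, since the abstract does not specify the paper's exact weighting. The function names `famd_embed` and `kurtosis_weighted_score`, the parameter `k`, and the synthetic data are all hypothetical.

```python
import numpy as np

def famd_embed(X_num, X_cat, tol=1e-10):
    """Plain FAMD embedding: SVD of scaled continuous + weighted one-hot columns."""
    # Standardize each continuous column to mean 0 and unit variance.
    Z = (X_num - X_num.mean(axis=0)) / X_num.std(axis=0)
    blocks = [Z]
    for j in range(X_cat.shape[1]):
        levels, codes = np.unique(X_cat[:, j], return_inverse=True)
        onehot = np.eye(len(levels))[codes.reshape(-1)]
        p = onehot.mean(axis=0)  # proportion of each level
        # Center the indicators and divide by sqrt(p), the usual FAMD scaling.
        blocks.append((onehot - p) / np.sqrt(p))
    M = np.hstack(blocks)
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    # Drop null directions (each categorical block is rank-deficient by one).
    keep = s > tol * s[0]
    return U[:, keep] * s[keep]  # component scores, ordered by decreasing variance

def kurtosis_weighted_score(E, k=2):
    """Assumed scoring rule: kurtosis-weighted squared scores on the first and last k components."""
    z = E / E.std(axis=0)              # unit-variance component scores
    kurt = (z ** 4).mean(axis=0)       # sample kurtosis of each component
    d = E.shape[1]
    idx = np.unique(np.r_[0:min(k, d), max(d - k, 0):d])  # first k and last k components
    w = kurt[idx] / kurt[idx].sum()    # normalized kurtosis weights
    return (z[:, idx] ** 2) @ w

# Synthetic mixed data with one planted anomaly in row 0 (illustrative only).
rng = np.random.default_rng(0)
n = 200
X_num = rng.normal(size=(n, 3))
X_cat = rng.integers(0, 3, size=(n, 2))
X_num[0, 0] = 50.0

E = famd_embed(X_num, X_cat)
scores = kurtosis_weighted_score(E, k=2)
print("highest-scoring row:", int(np.argmax(scores)))
```

Higher scores mark more anomalous rows. Weighting by kurtosis is one way to emphasize components whose score distribution is heavy-tailed, which is where a few extreme observations tend to show up; other scoring rules could be substituted on the same embedding.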
Related papers
- Anomaly Detection by Context Contrasting [57.695202846009714]
Anomaly detection focuses on identifying samples that deviate from the norm.
Recent advances in self-supervised learning have shown great promise in this regard.
We propose Con$_2$, which learns through context augmentations.
arXiv Detail & Related papers (2024-05-29T07:59:06Z)
- AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model [59.08735812631131]
Anomaly inspection plays an important role in industrial manufacture.
Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.
We propose AnomalyDiffusion, a novel diffusion-based few-shot anomaly generation model.
arXiv Detail & Related papers (2023-12-10T05:13:40Z)
- TabADM: Unsupervised Tabular Anomaly Detection with Diffusion Models [5.314466196448187]
We present a diffusion-based probabilistic model effective for unsupervised anomaly detection.
Our model is trained to learn the density of normal samples by utilizing a unique rejection scheme.
At inference, we identify anomalies as samples in low-density regions.
arXiv Detail & Related papers (2023-07-23T14:02:33Z)
- AGAD: Adversarial Generative Anomaly Detection [12.68966318231776]
Anomaly detection suffers from a lack of anomalies, owing to the diversity of abnormalities and the difficulty of obtaining large-scale anomaly data.
We propose Adversarial Generative Anomaly Detection (AGAD), a self-contrast-based anomaly detection paradigm.
Our method generates pseudo-anomaly data for both supervised and semi-supervised anomaly detection scenarios.
arXiv Detail & Related papers (2023-04-09T10:40:02Z)
- Catching Both Gray and Black Swans: Open-set Supervised Anomaly Detection [90.32910087103744]
A few labeled anomaly examples are often available in many real-world applications.
These anomaly examples provide valuable knowledge about the application-specific abnormality.
Those anomalies seen during training often do not illustrate every possible class of anomaly.
This paper tackles open-set supervised anomaly detection.
arXiv Detail & Related papers (2022-03-28T05:21:37Z)
- Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
- Understanding the Effect of Bias in Deep Anomaly Detection [15.83398707988473]
Anomaly detection presents a unique challenge in machine learning, due to the scarcity of labeled anomaly data.
Recent work attempts to mitigate such problems by augmenting training of deep anomaly detection models with additional labeled anomaly samples.
In this paper, we aim to understand the effect of a biased anomaly set on anomaly detection.
arXiv Detail & Related papers (2021-05-16T03:55:02Z)
- Sub-clusters of Normal Data for Anomaly Detection [0.15229257192293197]
Anomaly detection in data analysis is an interesting but still challenging research topic in real-world applications.
Existing anomaly detection methods show limited performance on high-dimensional data such as ImageNet.
In this paper, we study anomaly detection with high-dimensional and complex normal data.
arXiv Detail & Related papers (2020-11-17T03:53:31Z)
- Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset.
Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data.
We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z)
- Categorical anomaly detection in heterogeneous data using minimum description length clustering [3.871148938060281]
We propose a meta-algorithm for enhancing any MDL-based anomaly detection model to deal with heterogeneous data.
Our experimental results show that using a discrete mixture model provides competitive performance relative to two previous anomaly detection algorithms.
arXiv Detail & Related papers (2020-06-14T14:48:37Z)
- Deep Weakly-supervised Anomaly Detection [118.55172352231381]
Pairwise Relation prediction Network (PReNet) learns pairwise relation features and anomaly scores.
PReNet can detect any seen/unseen abnormalities that fit the learned pairwise abnormal patterns.
Empirical results on 12 real-world datasets show that PReNet significantly outperforms nine competing methods in detecting seen and unseen anomalies.
arXiv Detail & Related papers (2019-10-30T00:40:25Z)