COPOD: Copula-Based Outlier Detection
- URL: http://arxiv.org/abs/2009.09463v1
- Date: Sun, 20 Sep 2020 16:06:39 GMT
- Title: COPOD: Copula-Based Outlier Detection
- Authors: Zheng Li, Yue Zhao, Nicola Botta, Cezar Ionescu, Xiyang Hu
- Abstract summary: Outlier detection refers to the identification of rare items that are deviant from the general data distribution.
Existing approaches suffer from high computational complexity, low predictive capability, and limited interpretability.
We present a novel outlier detection algorithm called COPOD.
- Score: 7.963284082401154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Outlier detection refers to the identification of rare items that are deviant
from the general data distribution. Existing approaches suffer from high
computational complexity, low predictive capability, and limited
interpretability. As a remedy, we present a novel outlier detection algorithm
called COPOD, which is inspired by copulas for modeling multivariate data
distribution. COPOD first constructs an empirical copula, and then uses it to
predict tail probabilities of each given data point to determine its level of
"extremeness". Intuitively, we think of this as calculating an anomalous
p-value. This makes COPOD parameter-free, highly interpretable, and
computationally efficient. In this work, we make three key contributions: 1)
propose a novel, parameter-free outlier detection algorithm with both great
performance and interpretability, 2) perform extensive experiments on 30
benchmark datasets to show that COPOD outperforms in most cases and is also one
of the fastest algorithms, and 3) release an easy-to-use Python implementation
for reproducibility.
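The tail-probability idea in the abstract can be illustrated with a minimal, standard-library-only Python sketch. This is not the authors' released implementation: the function names `ecdf_values` and `tail_scores` and the toy data are assumptions made for illustration, and the sketch omits refinements described in the paper (such as skewness correction). It fits an empirical CDF per feature, converts each value into left-tail and right-tail probabilities, sums their negative logs across features, and keeps the larger of the two sums as the outlier score.

```python
import math
from bisect import bisect_right


def ecdf_values(column):
    """Return the empirical CDF value P(X <= x) for every x in the column."""
    ordered = sorted(column)
    n = len(ordered)
    return [bisect_right(ordered, x) / n for x in column]


def tail_scores(rows):
    """Score each row by its summed negative log tail probabilities.

    Higher scores mean the row sits further out in the tails.
    Hypothetical helper for illustration, not part of any library.
    """
    n_features = len(rows[0])
    left = [0.0] * len(rows)   # accumulates -log P(X_d <= x_d)
    right = [0.0] * len(rows)  # accumulates -log P(X_d >= x_d)
    for d in range(n_features):
        column = [row[d] for row in rows]
        left_cdf = ecdf_values(column)
        # The ECDF of the negated column gives the right-tail probability.
        right_cdf = ecdf_values([-x for x in column])
        for i in range(len(rows)):
            left[i] -= math.log(left_cdf[i])
            right[i] -= math.log(right_cdf[i])
    return [max(l, r) for l, r in zip(left, right)]


# Toy data (made up): the last row is extreme in both features.
data = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.2, 2.2], [8.0, 9.0]]
scores = tail_scores(data)
```

On this toy data the last row receives the highest score, because it has the smallest right-tail probability in both features. Since the score is a sum of per-feature terms, each term shows how much a given feature contributes, which is one way to read the interpretability claim.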
Related papers
- Efficiency of Unsupervised Anomaly Detection Methods on Software Logs [0.0]
This paper studies unsupervised and time efficient methods for anomaly detection.
The models are evaluated on four public datasets.
For speed, the OOV detector with word representation is optimal. For accuracy, the OOV detector combined with trigram representation yields the highest AUC-ROC (0.846).
arXiv Detail & Related papers (2023-12-04T14:44:31Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- AnoRand: A Semi Supervised Deep Learning Anomaly Detection Method by Random Labeling [0.0]
Anomaly detection, or more generally outlier detection, is one of the most popular and challenging subjects in theoretical and applied machine learning.
We present a new semi-supervised anomaly detection method called AnoRand, which combines a deep learning architecture with random synthetic label generation.
arXiv Detail & Related papers (2023-05-28T10:53:34Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on all three datasets on image classification in low data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- ECOD: Unsupervised Outlier Detection Using Empirical Cumulative Distribution Functions [12.798256312657136]
Outlier detection refers to the identification of data points that deviate from a general data distribution.
We present ECOD (Empirical-Cumulative-distribution-based Outlier Detection), which is inspired by the fact that outliers are often the "rare events" that appear in the tails of a distribution.
arXiv Detail & Related papers (2022-01-02T17:28:35Z)
- Learnable Locality-Sensitive Hashing for Video Anomaly Detection [44.19433917039249]
Video anomaly detection (VAD) mainly refers to identifying anomalous events that have not occurred in the training set where only normal samples are available.
We propose a novel distance-based VAD method to take advantage of all the available normal data efficiently and flexibly.
arXiv Detail & Related papers (2021-11-15T15:25:45Z)
- Highly Parallel Autoregressive Entity Linking with Discriminative Correction [51.947280241185]
We propose a very efficient approach that parallelizes autoregressive linking across all potential mentions.
Our model is >70 times faster and more accurate than the previous generative method.
arXiv Detail & Related papers (2021-09-08T17:28:26Z)
- An algorithm-based multiple detection influence measure for high dimensional regression using expectile [0.4999814847776096]
We propose an algorithm-based, multi-step, multiple detection procedure to identify influential observations.
Our three-step algorithm to identify and capture undesirable variability in the data, asymMIP, is based on two complementary statistics.
The application of our method to the Autism Brain Imaging Data Exchange dataset resulted in a more balanced and accurate prediction of brain maturity.
arXiv Detail & Related papers (2021-05-26T01:16:24Z)
- Sample and Computation Redistribution for Efficient Face Detection [137.19388513633484]
Training data sampling and computation distribution strategies are the keys to efficient and accurate face detection.
SCRFD-34GF outperforms the best competitor, TinaFace, by 3.86% (AP at hard set) while being more than 3× faster on GPUs with VGA-resolution images.
arXiv Detail & Related papers (2021-05-10T23:51:14Z)
- State Entropy Maximization with Random Encoders for Efficient Exploration [162.39202927681484]
Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL).
This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that utilizes state entropy as an intrinsic reward.
In particular, we find that the state entropy can be estimated in a stable and compute-efficient manner by utilizing a randomly initialized encoder.
arXiv Detail & Related papers (2021-02-18T15:45:17Z)
- SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to address it.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.