SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier
Detection
- URL: http://arxiv.org/abs/2003.05731v4
- Date: Fri, 5 Mar 2021 01:55:27 GMT
- Title: SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier
Detection
- Authors: Yue Zhao, Xiyang Hu, Cheng Cheng, Cong Wang, Changlin Wan, Wen Wang,
Jianing Yang, Haoping Bai, Zheng Li, Cao Xiao, Yunlong Wang, Zhi Qiao, Jimeng
Sun, Leman Akoglu
- Abstract summary: Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to speed up training and prediction with a large number of unsupervised, heterogeneous OD models.
- Score: 63.253850875265115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Outlier detection (OD) is a key machine learning (ML) task for identifying
abnormal objects from general samples with numerous high-stake applications
including fraud detection and intrusion detection. Due to the lack of ground
truth labels, practitioners often have to build a large number of unsupervised,
heterogeneous models (i.e., different algorithms with varying hyperparameters)
for further combination and analysis, rather than relying on a single model.
How can we accelerate training, and the scoring of newly arriving samples by
outlyingness (referred to as prediction throughout the paper), with a large number
of unsupervised, heterogeneous OD models? In this study, we propose a modular
acceleration system, called SUOD, to address it. The proposed system focuses on
three complementary acceleration aspects (data reduction for high-dimensional
data, approximation for costly models, and taskload imbalance optimization for
distributed environment), while maintaining performance accuracy. Extensive
experiments on more than 20 benchmark datasets demonstrate SUOD's effectiveness
in heterogeneous OD acceleration, along with a real-world deployment case on
fraudulent claim analysis at IQVIA, a leading healthcare firm. We open-source
SUOD for reproducibility and accessibility.
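The taskload imbalance aspect mentioned in the abstract can be illustrated with a small sketch. The abstract does not describe how SUOD schedules its models, so the code below is not the paper's method: it swaps in a generic greedy "longest processing time first" heuristic, and the function names and cost values are hypothetical. It only shows why splitting heterogeneous models evenly by count can leave one worker with most of the work.

```python
import heapq


def balance_tasks(costs, n_workers):
    """Assign models to workers with a greedy longest-processing-time heuristic.

    costs[i] is an estimated running cost for model i (hypothetical values).
    Returns a list with one list of model indices per worker.
    """
    # Min-heap of (current_load, worker_index): the least-loaded worker is on top.
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    # Place the most expensive models first, each on the least-loaded worker.
    for i in sorted(range(len(costs)), key=lambda k: costs[k], reverse=True):
        load, w = heapq.heappop(heap)
        assignment[w].append(i)
        heapq.heappush(heap, (load + costs[i], w))
    return assignment


def makespan(costs, assignment):
    """Total cost on the busiest worker, which bounds the overall wall time."""
    return max(sum(costs[i] for i in group) for group in assignment)


# Assumed cost estimates: two expensive models followed by six cheap ones.
costs = [9.0, 8.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# Naive split: the same number of models per worker, in their given order.
naive = [[0, 1, 2, 3], [4, 5, 6, 7]]
balanced = balance_tasks(costs, n_workers=2)

print(makespan(costs, naive))     # 19.0
print(makespan(costs, balanced))  # 12.0
```

With these assumed costs the even-count split puts both expensive models on the same worker, while the greedy assignment separates them and fills the gaps with cheap models.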
Related papers
- Leveraging Latent Diffusion Models for Training-Free In-Distribution Data Augmentation for Surface Defect Detection [9.784793380119806]
We introduce DIAG, a training-free Diffusion-based In-distribution Anomaly Generation pipeline for data augmentation.
Unlike conventional image generation techniques, we implement a human-in-the-loop pipeline, where domain experts provide multimodal guidance to the model.
We demonstrate the efficacy and versatility of DIAG with respect to state-of-the-art data augmentation approaches on the challenging KSDD2 dataset.
arXiv Detail & Related papers (2024-07-04T14:28:52Z)
- COFT-AD: COntrastive Fine-Tuning for Few-Shot Anomaly Detection [19.946344683965425]
We propose a novel methodology to address the challenge of FSAD.
We employ a model pre-trained on a large source dataset to initialize model weights.
We evaluate few-shot anomaly detection on 3 controlled AD tasks and 4 real-world AD tasks to demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-02-29T09:48:19Z)
- Semi-supervised Open-World Object Detection [74.95267079505145]
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z)
- MLAD: A Unified Model for Multi-system Log Anomaly Detection [35.68387377240593]
We propose MLAD, a novel anomaly detection model that incorporates semantic relational reasoning across multiple systems.
Specifically, we employ Sentence-BERT to capture the similarities between log sequences and convert them into high-dimensional learnable semantic vectors.
We revamp the formulas of the Attention layer to discern the significance of each keyword in the sequence and model the overall distribution of the multi-system dataset.
arXiv Detail & Related papers (2024-01-15T12:51:13Z)
- Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z)
- LafitE: Latent Diffusion Model with Feature Editing for Unsupervised
Multi-class Anomaly Detection [12.596635603629725]
We develop a unified model to detect anomalies from objects belonging to multiple classes when only normal data is accessible.
We first explore the generative-based approach and investigate latent diffusion models for reconstruction.
We introduce a feature editing strategy that modifies the input feature space of the diffusion model to further alleviate "identity shortcuts".
arXiv Detail & Related papers (2023-07-16T14:41:22Z)
- DAE: Discriminatory Auto-Encoder for multivariate time-series anomaly
detection in air transportation [68.8204255655161]
We propose a novel anomaly detection model called Discriminatory Auto-Encoder (DAE).
It uses the baseline of a regular LSTM-based auto-encoder but with several decoders, each getting data of a specific flight phase.
Results show that the DAE achieves better results in both accuracy and speed of detection.
arXiv Detail & Related papers (2021-09-08T14:07:55Z)
- Improving Variational Autoencoder based Out-of-Distribution Detection
for Embedded Real-time Applications [2.9327503320877457]
Out-of-distribution (OoD) detection is an emerging approach to address the challenge of detecting out-of-distribution inputs in real time.
In this paper, we show how we can robustly detect hazardous motion around autonomous driving agents.
Our methods significantly improve detection capabilities of OoD factors to unique driving scenarios, 42% better than state-of-the-art approaches.
Our model also generalized near-perfectly, 97% better than the state-of-the-art across the real-world and simulation driving data sets experimented.
arXiv Detail & Related papers (2021-07-25T07:52:53Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed
Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural network (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delays.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.