Automating Outlier Detection via Meta-Learning
- URL: http://arxiv.org/abs/2009.10606v2
- Date: Wed, 17 Mar 2021 14:44:35 GMT
- Title: Automating Outlier Detection via Meta-Learning
- Authors: Yue Zhao, Ryan A. Rossi, Leman Akoglu
- Abstract summary: We develop the first principled data-driven approach to model selection for outlier detection, called MetaOD, based on meta-learning.
We show the effectiveness of MetaOD in selecting a detection model that significantly outperforms the most popular outlier detectors.
To foster and further research on this new problem, we open-source our entire meta-learning system, benchmark environment, and testbed datasets.
- Score: 37.736124230543865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given an unsupervised outlier detection (OD) task on a new dataset, how can
we automatically select a good outlier detection method and its
hyperparameter(s) (collectively called a model)? Thus far, model selection for
OD has been a "black art", as any model evaluation is infeasible due to the
lack of (i) hold-out data with labels, and (ii) a universal objective function.
In this work, we develop the first principled data-driven approach to model
selection for OD, called MetaOD, based on meta-learning. MetaOD capitalizes on
the past performances of a large body of detection models on existing outlier
detection benchmark datasets, and carries over this prior experience to
automatically select an effective model to be employed on a new dataset without
using any labels. To capture task similarity, we introduce specialized
meta-features that quantify outlying characteristics of a dataset. Through
comprehensive experiments, we show the effectiveness of MetaOD in selecting a
detection model that significantly outperforms the most popular outlier
detectors (e.g., LOF and iForest) as well as various state-of-the-art
unsupervised meta-learners while being extremely fast. To foster
reproducibility and further research on this new problem, we open-source our
entire meta-learning system, benchmark environment, and testbed datasets.
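The selection strategy the abstract describes, summarizing a dataset with meta-features and carrying over model performance from the most similar historical benchmark, can be sketched roughly as follows. This is a minimal illustration, not MetaOD's actual implementation: the meta-features, dataset names, and performance numbers below are all fabricated stand-ins.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def meta_features(X):
    """Toy meta-feature vector summarizing a dataset's outlying
    characteristics (illustrative stand-ins for MetaOD's features)."""
    return np.array([
        float(np.mean(skew(X, axis=0))),      # average per-feature skewness
        float(np.mean(kurtosis(X, axis=0))),  # average per-feature excess kurtosis
    ])

# Fabricated "meta-train" table: past performance of candidate models on
# historical benchmark datasets (numbers are illustrative, not real results).
rng = np.random.default_rng(0)
history = {
    "bench_a": rng.normal(size=(200, 5)),       # light-tailed benchmark
    "bench_b": rng.exponential(size=(200, 5)),  # heavy-tailed benchmark
}
past_perf = {
    "bench_a": {"LOF": 0.62, "iForest": 0.71},
    "bench_b": {"LOF": 0.80, "iForest": 0.55},
}

def select_model(X_new):
    """Return the model that performed best on the most similar
    historical dataset, measured in meta-feature space -- no labels
    on X_new are needed."""
    f_new = meta_features(X_new)
    nearest = min(history,
                  key=lambda name: np.linalg.norm(meta_features(history[name]) - f_new))
    return max(past_perf[nearest], key=past_perf[nearest].get)

print(select_model(history["bench_a"]))  # → iForest (bench_a is its own nearest neighbor)
```

The key design point mirrored here is that no label or objective function is ever evaluated on the new dataset; all supervision comes from the historical performance table.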
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that existing machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Zero-shot Outlier Detection via Prior-data Fitted Networks: Model Selection Bygone! [28.823740273813296]
Outlier detection (OD) has numerous applications in environmental monitoring, cybersecurity, finance, and medicine.
Being an inherently unsupervised task, model selection is a key bottleneck for OD without label supervision.
We present FoMo-0D for zero/0-shot OD, exploring a transformative new direction that bypasses the hurdle of model selection altogether.
arXiv Detail & Related papers (2024-09-09T14:41:24Z) - IoTGeM: Generalizable Models for Behaviour-Based IoT Attack Detection [3.3772986620114387]
We present an approach for modelling IoT network attacks that focuses on generalizability, yet also leads to better detection and performance.
First, we present an improved rolling window approach for feature extraction, and introduce a multi-step feature selection process that reduces overfitting.
Second, we build and test models using isolated train and test datasets, thereby avoiding common data leaks.
Third, we rigorously evaluate our methodology using a diverse portfolio of machine learning models, evaluation metrics and datasets.
arXiv Detail & Related papers (2023-10-17T21:46:43Z) - Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability [70.72426887518517]
Out-of-distribution (OOD) detection is an indispensable aspect of secure AI when deploying machine learning models in real-world applications.
We propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.
Our method utilizes a mask to identify the memorized atypical samples, and then fine-tunes the model or prunes it with the introduced mask to forget them.
arXiv Detail & Related papers (2023-06-06T14:23:34Z) - Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
Our approach achieves superior performance compared to state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z) - Toward Unsupervised Outlier Model Selection [20.12322454417006]
ELECT is a new approach to select an effective model on a new dataset without any labels.
It is based on meta-learning, transferring prior knowledge (e.g., model performance) from historical datasets that are similar to the new one.
It can serve output on demand, accommodating varying time budgets.
arXiv Detail & Related papers (2022-11-03T14:14:46Z) - Meta-Learning for Unsupervised Outlier Detection with Optimal Transport [4.035753155957698]
We propose a novel approach to automate outlier detection based on meta-learning from previous datasets with outliers.
We leverage optimal transport in particular, to find the dataset with the most similar underlying distribution, and then apply the outlier detection techniques that proved to work best for that data distribution.
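The dataset-matching step described here can be sketched with a one-dimensional Wasserstein distance, a simplification of full optimal transport; the dataset names and distributions below are purely illustrative, not from the paper.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Hypothetical repository of historical datasets, reduced to 1-D samples
# for simplicity (the paper's actual construction may differ).
rng = np.random.default_rng(1)
historical = {
    "normal_bench": rng.normal(0.0, 1.0, size=1000),
    "exp_bench": rng.exponential(1.0, size=1000),
}

def most_similar(new_sample):
    """Pick the historical dataset with the smallest 1-D Wasserstein
    (optimal transport) distance to the new sample's distribution."""
    return min(historical,
               key=lambda name: wasserstein_distance(historical[name], new_sample))

new_data = rng.exponential(1.0, size=800)  # heavy-tailed, like exp_bench
print(most_similar(new_data))              # → exp_bench
```

Once the closest historical dataset is found, one would deploy whichever detector performed best on that dataset, following the transfer idea described above.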
arXiv Detail & Related papers (2022-11-01T10:36:48Z) - Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z) - Meta-Regularization by Enforcing Mutual-Exclusiveness [0.8057006406834467]
We propose a regularization technique for meta-learning models that gives the model designer more control over the information flow during meta-training.
Our proposed regularization function shows an accuracy boost of approximately 36% on the Omniglot dataset.
arXiv Detail & Related papers (2021-01-24T22:57:19Z) - Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.