Hyperparameter Sensitivity in Deep Outlier Detection: Analysis and a
Scalable Hyper-Ensemble Solution
- URL: http://arxiv.org/abs/2206.07647v1
- Date: Wed, 15 Jun 2022 16:46:00 GMT
- Title: Hyperparameter Sensitivity in Deep Outlier Detection: Analysis and a
Scalable Hyper-Ensemble Solution
- Authors: Xueying Ding, Lingxiao Zhao, Leman Akoglu
- Abstract summary: We conduct the first large-scale analysis of the HP sensitivity of deep OD methods.
We design an HP-robust and scalable deep hyper-ensemble model called ROBOD that assembles models with varying HP configurations.
- Score: 21.130842136324528
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Outlier detection (OD) literature exhibits numerous algorithms as it applies
to diverse domains. However, given a new detection task, it is unclear how to
choose an algorithm to use, or how to set its hyperparameter(s) (HPs) in
unsupervised settings. HP tuning is an ever-growing problem with the arrival of
many new detectors based on deep learning. While they have appealing properties
such as task-driven representation learning and end-to-end optimization, deep
models come with a long list of HPs. Surprisingly, the issue of model selection
in the outlier mining literature has been "the elephant in the room": a
significant factor in unlocking the utmost potential of deep methods, yet
little has been said or done to systematically tackle the issue. In the first part of
this paper, we conduct the first large-scale analysis of the HP sensitivity of
deep OD methods, and through more than 35,000 trained models, quantitatively
demonstrate that model selection is inevitable. Next, we design an HP-robust
and scalable deep hyper-ensemble model called ROBOD that assembles models with
varying HP configurations, bypassing choice paralysis. Importantly, we
introduce novel strategies to speed up ensemble training, such as parameter
sharing, batch/simultaneous training, and data subsampling, that allow us to
train fewer models with fewer parameters. Extensive experiments on both image
and tabular datasets show that ROBOD achieves and retains robust,
state-of-the-art detection performance as compared to its modern counterparts,
while taking only 2-10% of the time required by the naive hyper-ensemble with
independent training.
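To make the hyper-ensemble idea concrete, here is a minimal PyTorch sketch of the naive version that ROBOD accelerates: train one autoencoder per hyperparameter configuration and average z-normalized reconstruction errors into a single outlier score. All names and the toy HP grid are illustrative; ROBOD's parameter sharing, batched simultaneous training, and data subsampling are not reproduced here.

import itertools
import torch
import torch.nn as nn

def make_autoencoder(n_features, width, depth, dropout):
    # Symmetric MLP autoencoder; width/depth/dropout are the varied HPs.
    enc, d = [], n_features
    for _ in range(depth):
        enc += [nn.Linear(d, width), nn.ReLU(), nn.Dropout(dropout)]
        d = width
    dec = []
    for _ in range(depth - 1):
        dec += [nn.Linear(d, width), nn.ReLU()]
    dec += [nn.Linear(d, n_features)]
    return nn.Sequential(*enc, *dec)

def hyper_ensemble_scores(X, grid, epochs=50, lr=1e-3):
    # Naive hyper-ensemble: train one model per HP configuration, then
    # average z-normalized reconstruction errors so scores are comparable.
    all_scores = []
    for width, depth, dropout in grid:
        model = make_autoencoder(X.shape[1], width, depth, dropout)
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = ((model(X) - X) ** 2).mean()
            loss.backward()
            opt.step()
        with torch.no_grad():
            err = ((model(X) - X) ** 2).mean(dim=1)  # per-sample error
        all_scores.append((err - err.mean()) / (err.std() + 1e-8))
    return torch.stack(all_scores).mean(dim=0)  # higher = more outlying

X = torch.randn(512, 16)                        # toy data
grid = list(itertools.product([32, 64], [1, 2], [0.0, 0.2]))
scores = hyper_ensemble_scores(X, grid)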
Related papers
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the high computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
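For context on the two MoE-fusion entries above, the sketch below shows a generic sparse mixture-of-experts layer in PyTorch, where pre-trained expert layers are combined behind a learned router. It illustrates only the general mechanism; SMILE's zero-shot low-rank expert construction and the Pareto-set fusion scheme are not reproduced, and all names are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, experts, d_in, k=1):
        super().__init__()
        # `experts` could be layers copied from specialized source models.
        self.experts = nn.ModuleList(experts)
        self.gate = nn.Linear(d_in, len(experts))  # learned router
        self.k = k

    def forward(self, x):
        logits = self.gate(x)                       # (batch, n_experts)
        topv, topi = logits.topk(self.k, dim=-1)    # route to top-k experts
        weights = F.softmax(topv, dim=-1)
        out = 0.0
        for j in range(self.k):
            idx = topi[:, j]
            # Dispatch each sample to its j-th chosen expert.
            expert_out = torch.stack(
                [self.experts[int(i)](xi) for i, xi in zip(idx, x)])
            out = out + weights[:, j:j + 1] * expert_out
        return out

experts = [nn.Linear(8, 8) for _ in range(4)]  # stand-ins for task experts
moe = SparseMoE(experts, d_in=8, k=1)
y = moe(torch.randn(32, 8))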
arXiv Detail & Related papers (2024-06-14T07:16:18Z) - Intrusion Detection System with Machine Learning and Multiple Datasets [0.0]
This paper explores an enhanced intrusion detection system (IDS) that utilizes machine learning (ML).
Ultimately, the improved system can be used to combat attacks by malicious hackers.
arXiv Detail & Related papers (2023-12-04T14:58:19Z) - Fast Unsupervised Deep Outlier Model Selection with Hypernetworks [32.15262629879272]
We introduce HYPER for tuning DOD models, tackling two fundamental challenges: validation without supervision, and efficient search of the HP/model space.
A key idea is to design and train a novel hypernetwork (HN) that maps HPs onto optimal weights of the main DOD model.
In turn, HYPER capitalizes on a single HN that can dynamically generate weights for many DOD models.
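A minimal sketch of the hypernetwork idea, under assumed shapes and names: a small MLP maps a hyperparameter vector to the flattened weights of an autoencoder-style target model, so one trained hypernetwork can emit weights for many HP configurations. This is not HYPER's actual architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

D_IN, D_HID = 16, 8            # target model: D_IN -> D_HID -> D_IN
N_W = D_IN * D_HID + D_HID + D_HID * D_IN + D_IN  # flattened weight count

class HyperNet(nn.Module):
    def __init__(self, hp_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(hp_dim, 64), nn.ReLU(),
                                 nn.Linear(64, N_W))

    def forward(self, hp, x):
        w = self.net(hp)  # generate target weights from the HP vector
        i = 0
        W1 = w[i:i + D_IN * D_HID].view(D_HID, D_IN); i += D_IN * D_HID
        b1 = w[i:i + D_HID]; i += D_HID
        W2 = w[i:i + D_HID * D_IN].view(D_IN, D_HID); i += D_HID * D_IN
        b2 = w[i:]
        h = F.relu(F.linear(x, W1, b1))
        return F.linear(h, W2, b2)   # reconstruction for an AE-style scorer

hn = HyperNet()
x = torch.randn(128, D_IN)
hp = torch.tensor([0.5, 0.1])     # e.g. encoded dropout rate, weight decay
recon = hn(hp, x)
loss = ((recon - x) ** 2).mean()  # train the HN across sampled HP vectors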
arXiv Detail & Related papers (2023-07-20T02:07:20Z) - Unleashing the Potential of Unsupervised Deep Outlier Detection through
Automated Training Stopping [33.99128209697431]
Outlier detection (OD) has received continuous research interests due to its wide applications.
We propose a novel metric called loss entropy to internally evaluate the model performance during training.
Our approach is the first to enable reliable identification of the optimal training stopping point during training without requiring any labels.
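One plausible reading of the loss-entropy signal (the paper's exact formulation may differ): histogram the per-sample losses at each epoch and track the entropy of that distribution, stopping when it stops improving. A NumPy sketch with stand-in loss traces:

import numpy as np

def loss_entropy(per_sample_losses, bins=32):
    # Entropy of the empirical per-sample loss distribution.
    hist, _ = np.histogram(per_sample_losses, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                         # drop empty bins
    return float(-(p * np.log(p)).sum())

# Usage: record the entropy per epoch and stop when it stops improving.
history = [loss_entropy(np.random.exponential(1.0 / (t + 1), size=1000))
           for t in range(20)]           # stand-in loss traces
best_epoch = int(np.argmin(history))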
arXiv Detail & Related papers (2023-05-26T09:39:36Z) - Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
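As a point of reference, classical linear CCA is available in scikit-learn; the sketch below recovers correlated projections on toy data sharing a latent signal. The paper's dynamically scaled deep variant is not reproduced here.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))                  # shared latent signal
X = z @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(500, 10))
Y = z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))

cca = CCA(n_components=2)
Xc, Yc = cca.fit_transform(X, Y)  # maximally correlated linear projections
corrs = [np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(2)]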
arXiv Detail & Related papers (2022-03-23T12:52:49Z) - Multi-Scale Aligned Distillation for Low-Resolution Detection [68.96325141432078]
This paper focuses on boosting the performance of low-resolution models by distilling knowledge from a high- or multi-resolution model.
On several instance-level detection tasks and datasets, the low-resolution models trained via our approach perform competitively with high-resolution models trained via conventional multi-scale training.
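For background, a generic temperature-scaled distillation loss (Hinton-style) is sketched below in PyTorch; the paper's multi-scale aligned variant adds cross-resolution alignment that is not shown, and the constants are illustrative.

import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets from the (frozen) teacher, blended with hard-label CE.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(16, 10, requires_grad=True)   # stand-in student logits
t = torch.randn(16, 10)                       # stand-in teacher logits
y = torch.randint(0, 10, (16,))
loss = kd_loss(s, t, y)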
arXiv Detail & Related papers (2021-09-14T12:53:35Z) - Practical and sample efficient zero-shot HPO [8.41866793161234]
We provide an overview of available approaches and introduce two novel techniques to handle the problem.
The first is based on a surrogate model and adaptively chooses (dataset, configuration) pairs to query.
The second, for settings where finding, tuning, and testing a surrogate model is problematic, is a multi-fidelity technique combining HyperBand with submodular optimization.
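For context, successive halving, the core subroutine of HyperBand, can be sketched in a few lines of Python: evaluate many configurations on a small budget, keep the top fraction, and grow the budget. The stand-in objective and all names are illustrative; the paper's submodular component is not reproduced.

import random

def evaluate(config, budget):
    # Stand-in objective: lower is better; real use would train for `budget`.
    return (config["lr"] - 0.01) ** 2 + random.gauss(0, 0.001) / budget

def successive_halving(configs, min_budget=1, eta=3, rounds=3):
    budget = min_budget
    for _ in range(rounds):
        scored = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = scored[:max(1, len(configs) // eta)]  # keep top 1/eta
        budget *= eta                                   # grow the budget
    return configs[0]

configs = [{"lr": 10 ** random.uniform(-4, -1)} for _ in range(27)]
best = successive_halving(configs)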
arXiv Detail & Related papers (2020-07-27T08:56:55Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z) - Better Trees: An empirical study on hyperparameter tuning of
classification decision tree induction algorithms [5.4611430411491115]
Decision Tree (DT) induction algorithms offer high predictive performance and interpretable classification models.
This paper investigates the effects of hyperparameter tuning for the two DT induction algorithms most often used, CART and C4.5.
Experiments were carried out with different tuning strategies to induce models and to evaluate HPs' relevance using 94 classification datasets from OpenML.
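A minimal CART-style tuning sketch with scikit-learn's GridSearchCV shows the kind of HP search the study evaluates at scale (the study itself covers richer strategies, C4.5, and 94 OpenML datasets; the grid here is illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
param_grid = {
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 5, 20],
    "criterion": ["gini", "entropy"],
}
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))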
arXiv Detail & Related papers (2018-12-05T19:59:20Z)