Fast Unsupervised Deep Outlier Model Selection with Hypernetworks
- URL: http://arxiv.org/abs/2307.10529v3
- Date: Sat, 24 Aug 2024 20:39:06 GMT
- Title: Fast Unsupervised Deep Outlier Model Selection with Hypernetworks
- Authors: Xueying Ding, Yue Zhao, Leman Akoglu
- Abstract summary: We introduce HYPER for tuning DOD models, tackling two fundamental challenges: validation without supervision, and efficient search of the HP/model space.
A key idea is to design and train a novel hypernetwork (HN) that maps HPs onto optimal weights of the main DOD model.
In turn, HYPER capitalizes on a single HN that can dynamically generate weights for many DOD models.
- Score: 32.15262629879272
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Outlier detection (OD) finds many applications with a rich literature of numerous techniques. Deep neural network based OD (DOD) has seen a recent surge of attention thanks to the many advances in deep learning. In this paper, we consider a critical-yet-understudied challenge with unsupervised DOD, that is, effective hyperparameter (HP) tuning/model selection. While several prior works report the sensitivity of OD models to HPs, it becomes ever so critical for the modern DOD models that exhibit a long list of HPs. We introduce HYPER for tuning DOD models, tackling two fundamental challenges: (1) validation without supervision (due to lack of labeled anomalies), and (2) efficient search of the HP/model space (due to exponential growth in the number of HPs). A key idea is to design and train a novel hypernetwork (HN) that maps HPs onto optimal weights of the main DOD model. In turn, HYPER capitalizes on a single HN that can dynamically generate weights for many DOD models (corresponding to varying HPs), which offers significant speed-up. In addition, it employs meta-learning on historical OD tasks with labels to train a proxy validation function, likewise trained with our proposed HN efficiently. Extensive experiments on 35 OD tasks show that HYPER achieves high performance against 8 baselines with significant efficiency gains.
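The hypernetwork idea can be pictured with a minimal sketch: a small network takes a vector of HP values and emits the weights of an autoencoder-style detector, so scoring under a new HP configuration requires only a forward pass of the HN rather than retraining. The sketch below is an illustrative assumption, not the authors' implementation; the class name HyperNet, the HP vector layout, and the one-hidden-layer autoencoder are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperNet(nn.Module):
    """Illustrative hypernetwork: HP vector -> weights of a small autoencoder detector."""

    def __init__(self, hp_dim: int, in_dim: int, hidden_dim: int):
        super().__init__()
        self.in_dim, self.hidden_dim = in_dim, hidden_dim
        # Parameter counts of a 1-hidden-layer autoencoder:
        # encoder (in_dim -> hidden_dim) and decoder (hidden_dim -> in_dim), plus biases.
        self.n_enc_w = in_dim * hidden_dim
        self.n_dec_w = hidden_dim * in_dim
        n_params = self.n_enc_w + hidden_dim + self.n_dec_w + in_dim
        self.body = nn.Sequential(nn.Linear(hp_dim, 128), nn.ReLU(), nn.Linear(128, n_params))

    def forward(self, hp: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        """Generate detector weights from `hp`, then score `x` by reconstruction error."""
        w = self.body(hp)
        i = 0
        enc_w = w[i:i + self.n_enc_w].view(self.hidden_dim, self.in_dim); i += self.n_enc_w
        enc_b = w[i:i + self.hidden_dim];                                 i += self.hidden_dim
        dec_w = w[i:i + self.n_dec_w].view(self.in_dim, self.hidden_dim); i += self.n_dec_w
        dec_b = w[i:i + self.in_dim]
        h = torch.relu(F.linear(x, enc_w, enc_b))
        x_hat = F.linear(h, dec_w, dec_b)
        return ((x - x_hat) ** 2).mean(dim=1)  # per-sample outlier scores

# One HN can serve many candidate HP settings without retraining the detector.
hn = HyperNet(hp_dim=3, in_dim=20, hidden_dim=8)
x = torch.randn(64, 20)
scores = hn(torch.tensor([0.1, 1e-4, 1.0]), x)  # scores under one HP configuration
```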
Related papers
- From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks [23.928893359202753]
Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI) tasks.
However, deploying them brings significant challenges due to the huge cost of memory, energy, and computation.
Recently, there has been a surge in research of compression methods to achieve model efficiency while retaining the performance.
arXiv Detail & Related papers (2024-05-09T18:17:25Z)
- DistiLLM: Towards Streamlined Distillation for Large Language Models [53.46759297929675]
DistiLLM is a more effective and efficient KD framework for auto-regressive language models.
DistiLLM comprises two components: (1) a novel skew Kullback-Leibler divergence loss, where we unveil and leverage its theoretical properties, and (2) an adaptive off-policy approach designed to enhance the efficiency in utilizing student-generated outputs.
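For context, a skew KL divergence typically compares one distribution against a mixture of itself and the other distribution, which keeps the divergence finite even when supports barely overlap. The sketch below assumes the generic form KL(p || alpha*p + (1-alpha)*q); the exact formulation and the role of alpha in DistiLLM may differ.

```python
import torch

def skew_kl(p_logits: torch.Tensor, q_logits: torch.Tensor, alpha: float = 0.1) -> torch.Tensor:
    """Generic skew KL divergence KL(p || alpha*p + (1-alpha)*q); illustrative only."""
    p = torch.softmax(p_logits, dim=-1)
    q = torch.softmax(q_logits, dim=-1)
    mix = alpha * p + (1.0 - alpha) * q
    return (p * (p.log() - mix.log())).sum(dim=-1).mean()
```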
arXiv Detail & Related papers (2024-02-06T11:10:35Z)
- Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory [64.11870454160614]
We propose an efficient Adaptive HOI Detector with Concept-guided Memory (ADA-CM)
ADA-CM has two operating modes. The first mode makes it tunable without learning new parameters in a training-free paradigm.
Our proposed method achieves competitive results with state-of-the-art on the HICO-DET and V-COCO datasets with much less training time.
arXiv Detail & Related papers (2023-09-07T13:10:06Z)
- Unleashing the Potential of Unsupervised Deep Outlier Detection through Automated Training Stopping [33.99128209697431]
Outlier detection (OD) has received continuous research interest due to its wide applications.
We propose a novel metric called loss entropy to internally evaluate the model performance during training.
Our approach is the first to enable reliable identification of the optimal stopping point during training without requiring any labels.
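The idea of "loss entropy" can be illustrated with a rough sketch: compute the entropy of the per-sample training-loss distribution at each epoch and watch how it evolves. The histogram-based definition and the helper name below are assumptions for illustration; the paper's exact metric may differ.

```python
import numpy as np

def loss_entropy(sample_losses: np.ndarray, n_bins: int = 50) -> float:
    """Entropy of the per-sample loss distribution (illustrative proxy, not the paper's exact metric)."""
    hist, _ = np.histogram(sample_losses, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# Example: track the value across epochs and stop when it stabilizes (heuristic).
print(loss_entropy(np.random.rand(1000)))
```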
arXiv Detail & Related papers (2023-05-26T09:39:36Z)
- Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction [7.769624124148049]
In this work, we apply deep reinforcement learning to the HP model for protein folding.
We find that a DQN based on long short-term memory (LSTM) architecture greatly enhances the RL learning ability and significantly improves the search process.
Experimentally we show that it can find multiple distinct best-known solutions per trial.
arXiv Detail & Related papers (2022-11-27T21:17:48Z)
- Towards Unsupervised HPO for Outlier Detection [23.77292404327994]
We propose the first systematic approach called HPOD that is based on meta-learning.
HPOD capitalizes on the prior performance of a large collection of HPs on existing OD benchmark datasets.
It adapts (originally supervised) sequential model-based optimization to identify promising HPs efficiently.
arXiv Detail & Related papers (2022-08-24T18:11:22Z)
- Hyperparameter Sensitivity in Deep Outlier Detection: Analysis and a Scalable Hyper-Ensemble Solution [21.130842136324528]
We conduct the first large-scale analysis on the HP sensitivity of deep OD methods.
We design an HP-robust and scalable deep hyper-ensemble model called ROBOD that assembles models with varying HP configurations.
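As a rough illustration of the hyper-ensemble idea (not the authors' code), one can train several small autoencoder detectors over a grid of HP configurations and average their rank-normalized outlier scores; the grid, architecture, and function names below are assumptions.

```python
from itertools import product
import torch
import torch.nn as nn

def make_ae(in_dim: int, hidden_dim: int, dropout: float) -> nn.Module:
    """A tiny autoencoder detector; one ensemble member per HP configuration."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim), nn.ReLU(), nn.Dropout(dropout),
        nn.Linear(hidden_dim, in_dim),
    )

def hyper_ensemble_scores(X, hidden_dims=(4, 8, 16), dropouts=(0.0, 0.2), epochs=30):
    scores = []
    for h, d in product(hidden_dims, dropouts):
        model = make_ae(X.shape[1], h, d)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            opt.zero_grad()
            loss = ((model(X) - X) ** 2).mean()
            loss.backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            err = ((model(X) - X) ** 2).mean(dim=1)
        # Rank-normalize so differently scaled scores are comparable before averaging.
        scores.append(err.argsort().argsort().float() / (len(err) - 1))
    return torch.stack(scores).mean(dim=0)

X = torch.randn(256, 20)
ensemble_scores = hyper_ensemble_scores(X)  # averaged outlier scores per sample
```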
arXiv Detail & Related papers (2022-06-15T16:46:00Z)
- Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.
Combining SNNs with deep reinforcement learning (RL) provides a promising, energy-efficient way to handle realistic control tasks.
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
- Multi-Scale Aligned Distillation for Low-Resolution Detection [68.96325141432078]
This paper focuses on boosting the performance of low-resolution models by distilling knowledge from a high- or multi-resolution model.
On several instance-level detection tasks and datasets, the low-resolution models trained via our approach perform competitively with high-resolution models trained via conventional multi-scale training.
arXiv Detail & Related papers (2021-09-14T12:53:35Z)
- HyperSTAR: Task-Aware Hyperparameters for Deep Networks [52.50861379908611]
HyperSTAR is a task-aware method to warm-start HPO for deep neural networks.
It learns a dataset (task) representation along with the performance predictor directly from raw images.
It evaluates 50% fewer configurations than existing methods to achieve the best performance.
arXiv Detail & Related papers (2020-05-21T08:56:50Z)