HyperSTAR: Task-Aware Hyperparameters for Deep Networks
- URL: http://arxiv.org/abs/2005.10524v1
- Date: Thu, 21 May 2020 08:56:50 GMT
- Title: HyperSTAR: Task-Aware Hyperparameters for Deep Networks
- Authors: Gaurav Mittal, Chang Liu, Nikolaos Karianakis, Victor Fragoso, Mei
Chen, Yun Fu
- Abstract summary: HyperSTAR is a task-aware method to warm-start HPO for deep neural networks.
It learns a dataset (task) representation along with the performance predictor directly from raw images.
It evaluates 50% fewer configurations than existing methods to achieve the best performance.
- Score: 52.50861379908611
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While deep neural networks excel in solving visual recognition tasks, they
require significant effort to find hyperparameters that make them work
optimally. Hyperparameter Optimization (HPO) approaches have automated the
process of finding good hyperparameters but they do not adapt to a given task
(task-agnostic), making them computationally inefficient. To reduce HPO time,
we present HyperSTAR (System for Task Aware Hyperparameter Recommendation), a
task-aware method to warm-start HPO for deep neural networks. HyperSTAR ranks
and recommends hyperparameters by predicting their performance conditioned on a
joint dataset-hyperparameter space. It learns a dataset (task) representation
along with the performance predictor directly from raw images in an end-to-end
fashion. The recommendations, when integrated with an existing HPO method, make
it task-aware and significantly reduce the time to achieve optimal performance.
We conduct extensive experiments on 10 publicly available large-scale image
classification datasets over two different network architectures, validating
that HyperSTAR evaluates 50% fewer configurations to achieve the best
performance compared to existing methods. We further demonstrate that HyperSTAR
makes Hyperband (HB) task-aware, achieving the optimal accuracy in just 25% of
the budget required by both vanilla HB and Bayesian Optimized HB (BOHB).
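The abstract describes the core mechanism: a dataset (task) embedding learned from raw images is fused with a hyperparameter embedding, and a predictor over this joint space scores configurations so that the highest-ranked ones can warm-start an HPO method. The sketch below is an illustrative reading of that idea, not the authors' implementation; the encoder architectures, layer sizes, and function names are assumptions.

```python
# Minimal sketch (not the authors' code) of a task-aware performance predictor
# in the spirit of HyperSTAR: a dataset (task) embedding and a hyperparameter
# embedding are fused to predict the score of a configuration on that dataset,
# and configurations are ranked by the prediction to warm-start HPO.
import torch
import torch.nn as nn

class TaskAwarePredictor(nn.Module):
    def __init__(self, hp_dim: int, embed_dim: int = 128):
        super().__init__()
        # Image encoder: any small CNN works for this sketch; in the paper the
        # representation is learned jointly with the predictor from raw images.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Hyperparameter encoder over a flat configuration vector
        # (e.g., learning rate, weight decay, optimizer one-hot, ...).
        self.hp_encoder = nn.Sequential(
            nn.Linear(hp_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Predictor over the joint dataset-hyperparameter representation.
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, 1),
        )

    def forward(self, images: torch.Tensor, hp: torch.Tensor) -> torch.Tensor:
        # Average per-image features into a single dataset (task) embedding.
        task_emb = self.image_encoder(images).mean(dim=0, keepdim=True)
        hp_emb = self.hp_encoder(hp)                        # (num_configs, D)
        joint = torch.cat([task_emb.expand(hp_emb.size(0), -1), hp_emb], dim=1)
        return self.head(joint).squeeze(-1)                 # predicted scores

def recommend(model, images, configs, top_k=5):
    """Rank candidate configurations for a new dataset and return the top-k
    indices, which can seed (warm-start) an HPO method such as Hyperband
    in place of its random initial sampling."""
    with torch.no_grad():
        scores = model(images, configs)
    return torch.argsort(scores, descending=True)[:top_k]
```

In use, the top-ranked configurations would simply replace the random initial samples of a method such as Hyperband; that substitution is the sense in which the recommendation makes the search task-aware.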
Related papers
- Deep Ranking Ensembles for Hyperparameter Optimization [9.453554184019108]
We present a novel method that meta-learns neural network surrogates optimized for ranking the configurations' performances while modeling their uncertainty via ensembling.
In a large-scale experimental protocol comprising 12 baselines, 16 HPO search spaces and 86 datasets/tasks, we demonstrate that our method achieves new state-of-the-art results in HPO.
arXiv Detail & Related papers (2023-03-27T13:52:40Z) - A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization [57.450449884166346]
We propose an adaptive HPO method to account for the privacy cost of HPO.
We obtain state-of-the-art performance on 22 benchmark tasks, across computer vision and natural language processing, across pretraining and finetuning.
arXiv Detail & Related papers (2022-12-08T18:56:37Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times, with speedups of 3x-30x.
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO methods and explore how the low-rank factorization of intermediate layers of a convolutional network can be used to define an analytical response surface.
We quantify how this surface behaves as a surrogate for model performance and show that it can be optimized with a trust-region search algorithm, which we call autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z) - Hyperparameter Tuning is All You Need for LISTA [92.7008234085887]
Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network.
We show that adding momentum to intermediate variables in the LISTA network achieves a better convergence rate.
We call this new ultra-lightweight network HyperLISTA.
arXiv Detail & Related papers (2021-10-29T16:35:38Z) - HyperNP: Interactive Visual Exploration of Multidimensional Projection
Hyperparameters [61.354362652006834]
HyperNP is a scalable method that allows for real-time interactive exploration of projection methods by training neural network approximations.
We evaluate HyperNP across three datasets in terms of performance and speed.
arXiv Detail & Related papers (2021-06-25T17:28:14Z) - Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem.
Then, we use the average zeroth-order hyper-gradients to update hyperparameters (a minimal sketch of such a zeroth-order estimate appears after this list).
arXiv Detail & Related papers (2021-02-17T21:03:05Z) - Practical and sample efficient zero-shot HPO [8.41866793161234]
We provide an overview of available approaches and introduce two novel techniques to handle the problem.
The first is based on a surrogate model and adaptively chooses which (dataset, configuration) pairs to query.
The second, intended for settings where finding, tuning and testing a surrogate model is problematic, is a multi-fidelity technique combining HyperBand with submodular optimization.
arXiv Detail & Related papers (2020-07-27T08:56:55Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z)
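Of the entries above, the zeroth-order hyper-gradient idea (HOZOG) is the most self-contained to illustrate. The sketch below treats the validation loss as a black-box function of the hyperparameters and estimates its gradient from symmetric random perturbations; the function names, probe count, and toy objective are assumptions for illustration, not the paper's code.

```python
# Hedged sketch of the zeroth-order hyper-gradient idea (HOZOG-style): the
# validation loss is treated as a black box of the hyperparameters, its
# gradient is estimated from random perturbations, and the hyperparameters
# are updated with the averaged estimate. Names are illustrative only.
import numpy as np

def zeroth_order_hypergrad(val_loss, lam, mu=1e-2, num_samples=8, rng=None):
    """Two-point zeroth-order estimate of d val_loss / d lam.

    val_loss: callable mapping a hyperparameter vector to a scalar
              validation loss (in practice it would train/fine-tune a model).
    lam:      current hyperparameter vector (numpy array).
    mu:       smoothing radius for the finite-difference probes.
    """
    rng = rng if rng is not None else np.random.default_rng()
    grads = []
    for _ in range(num_samples):
        u = rng.standard_normal(lam.shape)           # random probe direction
        delta = val_loss(lam + mu * u) - val_loss(lam - mu * u)
        grads.append(delta / (2.0 * mu) * u)         # directional estimate
    return np.mean(grads, axis=0)                    # averaged hyper-gradient

# Example usage with a toy quadratic standing in for the (expensive) inner
# training run; the true minimizer is lam = [0.3, 0.1].
if __name__ == "__main__":
    f = lambda lam: np.sum((lam - np.array([0.3, 0.1])) ** 2)
    lam, lr = np.array([1.0, 1.0]), 0.1
    for _ in range(200):
        lam -= lr * zeroth_order_hypergrad(f, lam)
    print(lam)  # approaches [0.3, 0.1]
```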
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.