Hyper-Parameter Optimization: A Review of Algorithms and Applications
- URL: http://arxiv.org/abs/2003.05689v1
- Date: Thu, 12 Mar 2020 10:12:22 GMT
- Title: Hyper-Parameter Optimization: A Review of Algorithms and Applications
- Authors: Tong Yu and Hong Zhu
- Abstract summary: This paper provides a review of the most essential topics on automated hyper-parameter optimization (HPO).
The research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks.
The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
- Score: 14.524227656147968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since deep neural networks were developed, they have made huge contributions
to everyday lives. Machine learning provides more rational advice than humans
are capable of in almost every aspect of daily life. However, despite this
achievement, the design and training of neural networks are still challenging
and unpredictable procedures. To lower the technical thresholds for common
users, automated hyper-parameter optimization (HPO) has become a popular topic
in both academic and industrial areas. This paper provides a review of the most
essential topics on HPO. The first section introduces the key hyper-parameters
related to model training and structure, and discusses their importance and
methods to define the value range. Then, the research focuses on major
optimization algorithms and their applicability, covering their efficiency and
accuracy especially for deep learning networks. This study next reviews major
services and toolkits for HPO, comparing their support for state-of-the-art
searching algorithms, feasibility with major deep learning frameworks, and
extensibility for new modules designed by users. The paper concludes with
problems that exist when HPO is applied to deep learning, a comparison between
optimization algorithms, and prominent approaches for model evaluation with
limited computational resources.
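To make the kind of search the survey reviews concrete, below is a minimal, purely illustrative sketch of random-search HPO over a toy objective. The objective function, hyper-parameter names, and value ranges are assumptions for illustration, not drawn from the paper; a real loop would train and validate an actual model at each trial.

```python
import random

def validate(lr, batch_size):
    # Toy stand-in for "train a model and return validation accuracy".
    # Assumed to peak near lr=0.01 and batch_size=64 (illustrative shape).
    return 1.0 / (1.0 + abs(lr - 0.01) * 100 + abs(batch_size - 64) / 64)

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_score, best_cfg = float("-inf"), None
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(-4, -1),            # log-uniform learning rate
            "batch_size": rng.choice([16, 32, 64, 128]),  # categorical choice
        }
        score = validate(**cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score

best_cfg, best_score = random_search(50)
print(best_cfg, best_score)
```

Sampling the learning rate log-uniformly rather than uniformly reflects the common practice, discussed in HPO surveys, of defining value ranges on the scale at which the hyper-parameter actually varies.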
Related papers
- Efficient Hyperparameter Importance Assessment for CNNs [1.7778609937758323]
This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF.
We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets.
arXiv Detail & Related papers (2024-10-11T15:47:46Z) - Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed by training the networks better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Secrets of RLHF in Large Language Models Part I: PPO [81.01936993929127]
Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence.
Reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit.
In this report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising PPO algorithms impact policy agent training.
arXiv Detail & Related papers (2023-07-11T01:55:24Z) - PriorBand: Practical Hyperparameter Optimization in the Age of Deep Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We show its robustness across a range of DL benchmarks and show its gains under informative expert input and against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of key metrics that are used commonly to quantify the proficiency of models in terms of effectiveness, lightness, and computational costs.
The survey proceeds to cover three categories of the state-of-the-art of deep model design automation techniques.
arXiv Detail & Related papers (2022-08-22T12:12:43Z) - Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and
Implementation-Level Techniques [10.715525749057495]
Deep learning is pervasive in our daily life, including self-driving cars, virtual assistants, social network services, healthcare services, face recognition, etc.
Deep neural networks demand substantial compute resources during training and inference.
This article provides a survey on resource-efficient deep learning techniques in terms of model-, arithmetic-, and implementation-level techniques.
arXiv Detail & Related papers (2021-12-30T17:00:06Z) - Automated Benchmark-Driven Design and Explanation of Hyperparameter
Optimizers [3.729201909920989]
We present a principled approach to automated benchmark-driven algorithm design applied to multifidelity HPO (MF-HPO).
First, we formalize a rich space of MF-HPO candidates that includes, but is not limited to, common HPO algorithms, and then present a framework covering this space.
Through an ablation analysis, we examine whether the discovered design choices are necessary or could be replaced by simpler, more naive ones.
arXiv Detail & Related papers (2021-11-29T18:02:56Z) - A survey on multi-objective hyperparameter optimization algorithms for
Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms.
We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both.
We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z) - Benchmarking the Accuracy and Robustness of Feedback Alignment
Algorithms [1.2183405753834562]
Backpropagation is the default algorithm for training deep neural networks due to its simplicity, efficiency and high convergence rate.
In recent years, more biologically plausible learning methods have been proposed.
BioTorch is a software framework to create, train, and benchmark biologically motivated neural networks.
arXiv Detail & Related papers (2021-08-30T18:02:55Z) - Hyperparameter Optimization: Foundations, Algorithms, Best Practices and
Open Challenges [5.139260825952818]
This paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing.
It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
arXiv Detail & Related papers (2021-07-13T04:55:47Z)
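Among the methods the entry above lists, Hyperband builds on successive halving: allocate a small training budget to many configurations, then repeatedly keep only the best fraction and grow the budget for the survivors. The sketch below is a hedged, self-contained illustration of that core loop; the `train_partial` function is an assumed stand-in for partial training, not an API from any of the cited papers.

```python
import random

def train_partial(cfg, budget):
    # Stand-in for "train config `cfg` for `budget` epochs and return a
    # validation score". Score improves monotonically with budget.
    rng = random.Random(cfg["seed"])
    quality = rng.random()                  # intrinsic quality of this config
    return quality * (1 - 0.5 ** budget)    # more budget -> closer to quality

def successive_halving(n_configs=27, min_budget=1, eta=3):
    configs = [{"seed": i} for i in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Evaluate all survivors at the current budget, keep the top 1/eta.
        scored = sorted(configs, key=lambda c: train_partial(c, budget),
                        reverse=True)
        configs = scored[: max(1, len(configs) // eta)]
        budget *= eta                        # survivors earn a larger budget
    return configs[0]

best = successive_halving()
print(best)
```

With `n_configs=27` and `eta=3`, the loop runs three rounds (27 → 9 → 3 → 1), which is how successive halving spends most of its compute on the few configurations that survive early, cheap evaluations.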
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.