Hyper-Parameter Optimization: A Review of Algorithms and Applications
- URL: http://arxiv.org/abs/2003.05689v1
- Date: Thu, 12 Mar 2020 10:12:22 GMT
- Title: Hyper-Parameter Optimization: A Review of Algorithms and Applications
- Authors: Tong Yu and Hong Zhu
- Abstract summary: This paper provides a review of the most essential topics on automated hyper-parameter optimization (HPO).
The research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks.
The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
- Score: 14.524227656147968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since deep neural networks were developed, they have made huge contributions
to everyday lives. Machine learning provides more rational advice than humans
are capable of in almost every aspect of daily life. However, despite this
achievement, the design and training of neural networks are still challenging
and unpredictable procedures. To lower the technical thresholds for common
users, automated hyper-parameter optimization (HPO) has become a popular topic
in both academic and industrial areas. This paper provides a review of the most
essential topics on HPO. The first section introduces the key hyper-parameters
related to model training and structure, and discusses their importance and
methods to define the value range. Then, the research focuses on major
optimization algorithms and their applicability, covering their efficiency and
accuracy especially for deep learning networks. This study next reviews major
services and toolkits for HPO, comparing their support for state-of-the-art
searching algorithms, feasibility with major deep learning frameworks, and
extensibility for new modules designed by users. The paper concludes with
problems that exist when HPO is applied to deep learning, a comparison between
optimization algorithms, and prominent approaches for model evaluation with
limited computational resources.
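To make the kind of search the survey reviews concrete, below is a minimal, purely illustrative sketch of random-search HPO over a toy objective. The objective function, hyper-parameter names, and value ranges are assumptions for illustration, not drawn from the paper; a real loop would train and validate an actual model at each trial.

```python
import random

def validate(lr, batch_size):
    # Toy stand-in for "train a model and return validation accuracy".
    # Assumed to peak near lr=0.01 and batch_size=64 (illustrative shape).
    return 1.0 / (1.0 + abs(lr - 0.01) * 100 + abs(batch_size - 64) / 64)

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_score, best_cfg = float("-inf"), None
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(-4, -1),            # log-uniform learning rate
            "batch_size": rng.choice([16, 32, 64, 128]),  # categorical choice
        }
        score = validate(**cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score

best_cfg, best_score = random_search(50)
print(best_cfg, best_score)
```

Sampling the learning rate log-uniformly rather than uniformly reflects the common practice, discussed in HPO surveys, of defining value ranges on the scale at which the hyper-parameter actually varies.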
Related papers
- Efficient Hyperparameter Importance Assessment for CNNs [1.7778609937758323]
This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF.
We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets.
arXiv Detail & Related papers (2024-10-11T15:47:46Z) - Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed by training the networks better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Secrets of RLHF in Large Language Models Part I: PPO [81.01936993929127]
Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence.
Reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit.
In this report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising PPO algorithms impact policy agent training.
arXiv Detail & Related papers (2023-07-11T01:55:24Z) - PriorBand: Practical Hyperparameter Optimization in the Age of Deep Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We show its robustness across a range of DL benchmarks and show its gains under informative expert input and against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of key metrics that are used commonly to quantify the proficiency of models in terms of effectiveness, lightness, and computational costs.
The survey proceeds to cover three categories of the state-of-the-art of deep model design automation techniques.
arXiv Detail & Related papers (2022-08-22T12:12:43Z) - Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and
Implementation-Level Techniques [10.715525749057495]
Deep learning is pervasive in our daily life, including self-driving cars, virtual assistants, social network services, healthcare services, face recognition, etc.
Deep neural networks demand substantial compute resources during training and inference.
This article provides a survey on resource-efficient deep learning techniques in terms of model-, arithmetic-, and implementation-level techniques.
arXiv Detail & Related papers (2021-12-30T17:00:06Z) - Automated Benchmark-Driven Design and Explanation of Hyperparameter
Optimizers [3.729201909920989]
We present a principled approach to automated benchmark-driven algorithm design applied to multifidelity HPO (MF-HPO).
First, we formalize a rich space of MF-HPO candidates that includes, but is not limited to, common HPO algorithms, and then present a framework covering this space.
Through an ablation analysis, we examine whether the discovered design choices are necessary or could be replaced by simpler, more naive ones.
arXiv Detail & Related papers (2021-11-29T18:02:56Z) - A survey on multi-objective hyperparameter optimization algorithms for
Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms.
We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both.
We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z) - Benchmarking the Accuracy and Robustness of Feedback Alignment
Algorithms [1.2183405753834562]
Backpropagation is the default algorithm for training deep neural networks due to its simplicity, efficiency and high convergence rate.
In recent years, more biologically plausible learning methods have been proposed.
BioTorch is a software framework to create, train, and benchmark biologically motivated neural networks.
arXiv Detail & Related papers (2021-08-30T18:02:55Z) - Hyperparameter Optimization: Foundations, Algorithms, Best Practices and
Open Challenges [5.139260825952818]
This paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing.
It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
arXiv Detail & Related papers (2021-07-13T04:55:47Z)
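Among the methods the entry above lists, Hyperband builds on successive halving: allocate a small training budget to many configurations, then repeatedly keep only the best fraction and grow the budget for the survivors. The sketch below is a hedged, self-contained illustration of that core loop; the `train_partial` function is an assumed stand-in for partial training, not an API from any of the cited papers.

```python
import random

def train_partial(cfg, budget):
    # Stand-in for "train config `cfg` for `budget` epochs and return a
    # validation score". Score improves monotonically with budget.
    rng = random.Random(cfg["seed"])
    quality = rng.random()                  # intrinsic quality of this config
    return quality * (1 - 0.5 ** budget)    # more budget -> closer to quality

def successive_halving(n_configs=27, min_budget=1, eta=3):
    configs = [{"seed": i} for i in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Evaluate all survivors at the current budget, keep the top 1/eta.
        scored = sorted(configs, key=lambda c: train_partial(c, budget),
                        reverse=True)
        configs = scored[: max(1, len(configs) // eta)]
        budget *= eta                        # survivors earn a larger budget
    return configs[0]

best = successive_halving()
print(best)
```

With `n_configs=27` and `eta=3`, the loop runs three rounds (27 → 9 → 3 → 1), which is how successive halving spends most of its compute on the few configurations that survive early, cheap evaluations.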
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.