Hyperparameter Optimization with Neural Network Pruning
- URL: http://arxiv.org/abs/2205.08695v1
- Date: Wed, 18 May 2022 02:51:47 GMT
- Title: Hyperparameter Optimization with Neural Network Pruning
- Authors: Kangil Lee, Junho Yim
- Abstract summary: We propose a proxy model for a neural network (N_B) to be used for hyperparameter optimization.
The proposed framework can reduce hyperparameter optimization time by up to 37%.
- Score: 6.193231258199234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since the deep learning model is highly dependent on hyperparameters,
hyperparameter optimization is essential in developing deep learning
model-based applications, even if it takes a long time. As service development
using deep learning models has gradually become competitive, many developers
highly demand rapid hyperparameter optimization algorithms. In order to keep
pace with the needs of faster hyperparameter optimization algorithms,
researchers are focusing on improving the speed of hyperparameter optimization
algorithms. However, the huge time consumption of hyperparameter optimization
due to the high computational cost of the deep learning model itself has not
been dealt with in depth. To solve this problem, much as a surrogate model is
used in Bayesian optimization, it is necessary to consider a proxy model for the
neural network (N_B) to be used for hyperparameter optimization. Inspired by the main
goal of neural network pruning, i.e., high computational cost reduction and
performance preservation, we presumed that the neural network (N_P) obtained
through neural network pruning would be a good proxy model of N_B. In order to
verify our idea, we performed extensive experiments using the CIFAR10, CIFAR100,
and TinyImageNet datasets, three widely used neural networks, and three
representative hyperparameter optimization methods. Through these experiments,
we verified that N_P can be a good proxy model of N_B for rapid hyperparameter
optimization. The proposed hyperparameter optimization framework can reduce the
optimization time by up to 37%.
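The recipe behind the framework can be sketched concretely: prune a copy of the target network N_B to obtain a cheap proxy N_P, run the hyperparameter search against N_P, and train N_B once with the best configuration found. Below is a minimal sketch in PyTorch under that reading; the pruning ratio, the random-search space, and the `train_and_evaluate` routine are illustrative assumptions, not the authors' implementation (the paper pairs pruning with three existing HPO methods on CIFAR10, CIFAR100, and TinyImageNet).

```python
# Minimal sketch of the pruned-proxy idea: tune hyperparameters on a pruned
# copy N_P of the target network N_B, then train N_B with the winning config.
# The search space, pruning ratio, and train_and_evaluate are assumptions for
# illustration, not the authors' released code.
import copy
import random

import torch
import torch.nn.utils.prune as prune


def make_proxy(model: torch.nn.Module, amount: float = 0.5) -> torch.nn.Module:
    """Return a pruned copy N_P of N_B using L1-unstructured magnitude pruning."""
    proxy = copy.deepcopy(model)
    for module in proxy.modules():
        weight = getattr(module, "weight", None)
        if isinstance(weight, torch.Tensor) and weight.dim() > 1:  # conv/linear layers
            prune.l1_unstructured(module, name="weight", amount=amount)
    return proxy


def tune_on_proxy(base_model, train_and_evaluate, n_trials=20):
    """Plain random search, evaluated on the cheap proxy instead of the full model."""
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** random.uniform(-4, -1),
            "weight_decay": 10 ** random.uniform(-6, -3),
        }
        proxy = make_proxy(base_model)          # fresh pruned stand-in for N_B
        score = train_and_evaluate(proxy, cfg)  # user-supplied short training run
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg

# Usage (assuming build_model() and train_and_evaluate() are supplied by the user):
#   best_cfg = tune_on_proxy(build_model(), train_and_evaluate)      # search on N_P
#   final = train_and_evaluate(build_model(), best_cfg)              # train N_B once
```

Any pruning method that preserves the ranking of hyperparameter configurations would fit this role; the paper's claim is that standard pruning satisfies this well enough to cut search time by up to 37%.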
Related papers
- Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach [5.232806761554172]
We use the advanced search algorithms for multiobjective optimization in DeepHyper to streamline the development of neural networks tailored for ocean modeling.
We demonstrate an approach to enhance the use of FNOs in ocean dynamics forecasting, offering a scalable solution with improved precision.
arXiv Detail & Related papers (2024-04-07T14:29:23Z) - Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed simply by training the networks better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalized from the sample-wise analysis to the real batch setting, NIO (Neural Initialization Optimization) is able to automatically search for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z) - Improving Multi-fidelity Optimization with a Recurring Learning Rate for Hyperparameter Tuning [7.591442522626255]
We propose Multi-fidelity Optimization with a Recurring Learning rate (MORL)
MORL incorporates CNNs' optimization process into multi-fidelity optimization.
It alleviates the slow-starter problem and achieves a more precise low-fidelity approximation.
arXiv Detail & Related papers (2022-09-26T08:16:31Z) - Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG)
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem.
Then, we use the average zeroth-order hyper-gradients to update hyperparameters.
arXiv Detail & Related papers (2021-02-17T21:03:05Z) - Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs)
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z) - Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians [5.33024001730262]
Self-Tuning Networks (STNs) have recently gained traction due to their ability to amortize the optimization of the inner objective.
We propose the $\Delta$-STN, an improved hypernetwork architecture which stabilizes training.
arXiv Detail & Related papers (2020-10-26T12:12:23Z) - Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z) - An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) in the scenario of hyperparameter search evaluation.
We also develop a novel hyperparameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments of SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z) - Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling [34.67740033646052]
By learning a parameterization of IS that trades off evaluation complexity and quality, we improve upon the Bayesian optimization state of the art in runtime and final validation error across a variety of datasets and complex neural architectures.
arXiv Detail & Related papers (2020-02-23T15:52:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.