Use of static surrogates in hyperparameter optimization
- URL: http://arxiv.org/abs/2103.07963v1
- Date: Sun, 14 Mar 2021 16:15:53 GMT
- Title: Use of static surrogates in hyperparameter optimization
- Authors: Dounia Lakhmiri and Sébastien Le Digabel
- Abstract summary: This work aims at enhancing HyperNOMAD, a library that adapts a direct search derivative-free optimization algorithm to tune both the architecture and the training of a neural network simultaneously.
These additions to HyperNOMAD are shown to reduce its resource consumption without harming the quality of the proposed solutions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimizing the hyperparameters and architecture of a neural network is a long yet necessary phase in the development of any new application. This time-consuming process can benefit from strategies designed to quickly discard low-quality configurations and focus on more promising candidates. This work aims at enhancing HyperNOMAD, a library that adapts a direct search derivative-free optimization algorithm to tune both the architecture and the training of a neural network simultaneously, by targeting two key steps of its execution and exploiting cheap approximations, in the form of static surrogates, to trigger the early stopping of the evaluation of a configuration and the ranking of pools of candidates. These additions to HyperNOMAD are shown to reduce its resource consumption without harming the quality of the proposed solutions.
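As a minimal illustration of the two mechanisms the abstract describes, the sketch below shows how a cheap static surrogate could abort an unpromising evaluation early and rank a pool of candidates before full training. Function names and the surrogate choice are assumptions for illustration, not HyperNOMAD's actual API.

```python
def should_stop_early(val_losses, incumbent_loss, patience=3):
    """Abort an evaluation when a cheap static surrogate says it cannot win.
    The surrogate here is an optimistic constant extrapolation: the best
    validation loss seen so far is taken as a prediction of the final loss.
    If training has stalled for `patience` epochs and even that optimistic
    prediction cannot beat the incumbent, the configuration is discarded."""
    if len(val_losses) < patience:
        return False  # too early to judge
    stalled = min(val_losses[-patience:]) > min(val_losses[:-patience] or val_losses)
    return stalled and min(val_losses) >= incumbent_loss

def rank_candidates(candidates, surrogate):
    """Order a pool of candidate configurations by a static surrogate score
    (e.g. loss after one cheap epoch), so the most promising candidates are
    evaluated in full first."""
    return sorted(candidates, key=surrogate)
```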
Related papers
- Memory-Efficient Optimization with Factorized Hamiltonian Descent [11.01832755213396]
We introduce H-Fac, a novel adaptive optimizer that incorporates a memory-efficient factorization approach to address the memory cost of adaptive methods.
By employing a rank-1 parameterization for both momentum and scaling parameter estimators, H-Fac reduces memory costs to a sublinear level.
We develop our algorithms based on principles derived from Hamiltonian dynamics, providing robust theoretical underpinnings in optimization dynamics and convergence guarantees.
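As a rough sketch of the rank-1 idea (in the spirit of Adafactor-style factorization; H-Fac's exact Hamiltonian-based update differs), the accumulator of an (m, n) parameter can be kept as one m-vector and one n-vector:

```python
import numpy as np

def factored_second_moment(r, c, grad, beta2=0.999, eps=1e-30):
    """Update row statistics r (shape m) and column statistics c (shape n)
    from the squared gradient of an (m, n) parameter, then reconstruct a
    rank-1 approximation of the full accumulator. Memory is O(m + n)
    instead of O(m * n)."""
    g2 = grad * grad
    r = beta2 * r + (1 - beta2) * g2.mean(axis=1)   # per-row mean of g^2
    c = beta2 * c + (1 - beta2) * g2.mean(axis=0)   # per-column mean of g^2
    v_hat = np.outer(r, c) / (r.mean() + eps)        # rank-1 reconstruction
    return r, c, v_hat
```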
arXiv Detail & Related papers (2024-06-14T12:05:17Z) - DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z) - Parallel Hyperparameter Optimization Of Spiking Neural Network [0.5371337604556311]
Spiking Neural Networks (SNNs) are based on a more biologically inspired approach than standard artificial neural networks.
We tackle the signal loss issue of SNNs that leads to what we call silent networks.
By defining an early stopping criterion, we were able to instantiate larger and more flexible search spaces.
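A minimal sketch of such a criterion (illustrative only; the paper's exact definition of a silent network may differ): count output spikes on a probe batch and prune the trial if the network emits none.

```python
def is_silent(output_spike_counts, min_spikes=1):
    """A network is 'silent' when its output layer emits (almost) no spikes
    on a probe batch, so its predictions carry no signal and the trial can
    be aborted without spending the full training budget.
    output_spike_counts: per-output-neuron spike totals over the batch."""
    return sum(output_spike_counts) < min_spikes

# Hypothetical use inside a search loop:
# if is_silent(count_spikes(model, probe_batch)):
#     prune_trial()   # free the budget for other configurations
```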
arXiv Detail & Related papers (2024-03-01T11:11:59Z) - Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can easily be changed by training the networks with better, architecture-aware hyperparameters.
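As a toy illustration of architecture-dependent hyperparameter scaling (a generic width-based rule, not the paper's derived prescriptions):

```python
def width_scaled_lr(base_lr, base_width, width):
    """Scale a learning rate tuned at `base_width` to a layer of a different
    width; wider layers get proportionally smaller rates, in the spirit of
    maximal-update-style parameterizations."""
    return base_lr * base_width / width

# Doubling the width halves the rate:
# width_scaled_lr(0.1, base_width=256, width=512) -> 0.05
```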
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Towards Hyperparameter-Agnostic DNN Training via Dynamical System
Insights [4.513581513983453]
We present a first-order optimization method specialized for deep neural networks (DNNs), ECCO-DNN.
This method models the optimization variable trajectory as a dynamical system and develops a discretization algorithm that adaptively selects step sizes based on the trajectory's shape.
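As a generic illustration of trajectory-shaped step-size selection (a Barzilai-Borwein-style secant rule, not ECCO-DNN's actual discretization):

```python
import numpy as np

def secant_step_size(x_prev, x_curr, g_prev, g_curr, eta_max=1.0):
    """Estimate curvature along the most recent segment of the optimization
    trajectory and pick a step size from it: sharp curvature shrinks the
    step, flat curvature allows a step up to eta_max."""
    s = x_curr - x_prev              # displacement along the trajectory
    y = g_curr - g_prev              # change in gradient along it
    sy = float(np.dot(s, y))
    if sy <= 0.0:
        return eta_max               # curvature estimate uninformative
    return min(eta_max, float(np.dot(s, s)) / sy)
```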
arXiv Detail & Related papers (2023-10-21T03:45:13Z) - Federated Multi-Level Optimization over Decentralized Networks [55.776919718214224]
We study the problem of distributed multi-level optimization over a network, where agents can only communicate with their immediate neighbors.
We propose a novel gossip-based distributed multi-level optimization algorithm that enables networked agents to solve optimization problems at different levels in a single timescale.
Our algorithm achieves optimal sample complexity, scaling linearly with the network size, and demonstrates state-of-the-art performance on various applications.
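A minimal sketch of the gossip communication pattern (the neighbor-mixing step only, not the paper's full multi-level algorithm):

```python
import numpy as np

def gossip_round(states, neighbors, mix=0.5):
    """One synchronous gossip round: every agent averages its state with the
    mean of its immediate neighbors' states. Assumes every agent has at
    least one neighbor. states: list of parameter vectors; neighbors:
    adjacency list of agent indices."""
    return [
        (1 - mix) * x + mix * np.mean([states[j] for j in neighbors[i]], axis=0)
        for i, x in enumerate(states)
    ]
```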
arXiv Detail & Related papers (2023-10-10T00:21:10Z) - Split-Boost Neural Networks [1.1549572298362787]
We propose an innovative training strategy for feed-forward architectures, called split-boost.
This novel approach ultimately allows us to avoid explicitly modeling the regularization term.
The proposed strategy is tested on a real-world (anonymized) dataset within a benchmark medical insurance design problem.
arXiv Detail & Related papers (2023-09-06T17:08:57Z) - Improving Multi-fidelity Optimization with a Recurring Learning Rate for
Hyperparameter Tuning [7.591442522626255]
We propose Multi-fidelity Optimization with a Recurring Learning rate (MORL).
MORL incorporates CNNs' optimization process into multi-fidelity optimization.
It alleviates the problem of slow-starter and achieves a more precise low-fidelity approximation.
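A sketch of a recurring (cyclic) learning-rate schedule of the kind the summary describes; MORL's exact schedule may differ:

```python
import math

def recurring_lr(epoch, cycle_len=10, lr_max=0.1, lr_min=1e-4):
    """Cosine schedule that restarts at lr_max at the start of every cycle,
    so each low-fidelity training slice begins with a fresh warm restart
    instead of a slow start."""
    t = (epoch % cycle_len) / cycle_len   # position within the cycle, [0, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))
```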
arXiv Detail & Related papers (2022-09-26T08:16:31Z) - Joint inference and input optimization in equilibrium networks [68.63726855991052]
Deep equilibrium models are a class of models that forgo traditional network depth and instead compute the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between the two settings of inference and input optimization.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
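A minimal sketch of the equilibrium forward pass (plain fixed-point iteration; practical deep equilibrium models use accelerated root-finders such as Anderson or Broyden):

```python
import numpy as np

def deq_forward(f, x, z_init, tol=1e-6, max_iter=200):
    """Compute the output of a deep equilibrium model by iterating a single
    nonlinear layer f(z, x) to its fixed point z* = f(z*, x)."""
    z = z_init
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# Toy contractive "layer": the fixed point of z -> tanh(W @ z + x)
# W = 0.5 * np.eye(4); x = np.ones(4)
# z_star = deq_forward(lambda z, x: np.tanh(W @ z + x), x, np.zeros(4))
```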
arXiv Detail & Related papers (2021-11-25T19:59:33Z) - Learning from Images: Proactive Caching with Parallel Convolutional
Neural Networks [94.85780721466816]
A novel framework for proactive caching is proposed in this paper.
It combines model-based optimization with data-driven techniques by transforming an optimization problem into a grayscale image.
Numerical results show that the proposed scheme can reduce 71.6% computation time with only 0.8% additional performance cost.
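A sketch of the image-encoding step only (the input matrix is a hypothetical stand-in; the paper's full pipeline then feeds such images to parallel CNNs):

```python
import numpy as np

def to_grayscale(matrix):
    """Min-max normalize a problem instance, e.g. a content-request matrix,
    and map it to 8-bit pixel intensities so a CNN can consume the
    optimization problem as a grayscale image."""
    m = np.asarray(matrix, dtype=float)
    span = m.max() - m.min()
    scaled = (m - m.min()) / (span + 1e-12)   # normalize to [0, 1]
    return np.uint8(np.round(scaled * 255))   # map to pixel values 0..255
```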
arXiv Detail & Related papers (2021-08-15T21:32:47Z) - Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of the wallclock time.
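As a simplified illustration of online hyperparameter adaptation (a hypergradient-descent-style rule on the learning rate; the paper's RTRL-based method is more general):

```python
import numpy as np

def hypergradient_lr_update(lr, grad, prev_grad, hyper_lr=1e-4, lr_min=1e-8):
    """Adapt the learning rate online from the alignment of successive
    gradients: positive alignment (still descending in a consistent
    direction) grows the rate, negative alignment (overshooting) shrinks it."""
    alignment = float(np.dot(grad, prev_grad))
    return max(lr_min, lr + hyper_lr * alignment)
```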
arXiv Detail & Related papers (2021-02-15T19:36:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.