A Unified Gaussian Process for Branching and Nested Hyperparameter
Optimization
- URL: http://arxiv.org/abs/2402.04885v1
- Date: Fri, 19 Jan 2024 21:11:32 GMT
- Title: A Unified Gaussian Process for Branching and Nested Hyperparameter
Optimization
- Authors: Jiazhao Zhang and Ying Hung and Chung-Ching Lin and Zicheng Liu
- Abstract summary: In deep learning, tuning parameters with conditional dependence are common in practice.
The new GP model accounts for the dependent structure among input variables through a new kernel function.
High prediction accuracy and better optimization efficiency are observed in a series of synthetic simulations and real data applications of neural networks.
- Score: 19.351804144005744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Choosing appropriate hyperparameters plays a crucial role in the success of
neural networks as hyper-parameters directly control the behavior and
performance of the training algorithms. To obtain efficient tuning, Bayesian
optimization methods based on Gaussian process (GP) models are widely used.
Despite numerous applications of Bayesian optimization in deep learning, the
existing methodologies are developed based on a convenient but restrictive
assumption that the tuning parameters are independent of each other. However,
tuning parameters with conditional dependence are common in practice. In this
paper, we focus on two types of them: branching and nested parameters. Nested
parameters refer to those tuning parameters that exist only within a particular
setting of another tuning parameter, and a parameter within which other
parameters are nested is called a branching parameter. To capture the
conditional dependence between branching and nested parameters, a unified
Bayesian optimization framework is proposed. The sufficient conditions are
rigorously derived to guarantee the validity of the kernel function, and the
asymptotic convergence of the proposed optimization framework is proven under
the continuum-armed-bandit setting. Based on the new GP model, which accounts
for the dependent structure among input variables through a new kernel
function, higher prediction accuracy and better optimization efficiency are
observed in a series of synthetic simulations and real data applications of
neural networks. Sensitivity analysis is also performed to provide insights
into how changes in hyperparameter values affect prediction accuracy.
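To make the branching/nested structure concrete, here is a minimal sketch of such a search space together with one simple way to build a valid kernel over it. The optimizer/momentum/beta2 space and the additive delta-kernel construction are illustrative assumptions made for this summary; they are not the unified kernel or the sufficient conditions derived in the paper.

```python
import numpy as np

# Illustrative search space (an assumption for this sketch, not taken from the
# paper): "optimizer" is a branching parameter, "momentum" is nested under the
# "sgd" branch, "beta2" is nested under the "adam" branch, and "log_lr" is a
# shared parameter present in every configuration.
def sample_config(rng):
    branch = rng.choice(["sgd", "adam"])
    cfg = {"optimizer": branch, "log_lr": rng.uniform(-5.0, -1.0)}
    if branch == "sgd":
        cfg["momentum"] = rng.uniform(0.0, 0.99)      # exists only under sgd
    else:
        cfg["beta2"] = rng.uniform(0.9, 0.9999)       # exists only under adam
    return cfg

def rbf(a, b, ls=1.0):
    return np.exp(-0.5 * ((a - b) / ls) ** 2)

def branching_kernel(c1, c2):
    """Toy positive-definite kernel for branching/nested inputs.

    k = k_shared + 1[same branch] * k_nested.  This is a sum of a kernel on
    the shared dimension and a delta kernel on the branch multiplied by a
    kernel on the nested dimension, hence positive definite.  The paper's
    unified kernel is more general; this only illustrates the structure.
    """
    k_shared = rbf(c1["log_lr"], c2["log_lr"])
    if c1["optimizer"] != c2["optimizer"]:
        return k_shared
    nested_key = "momentum" if c1["optimizer"] == "sgd" else "beta2"
    return k_shared + rbf(c1[nested_key], c2[nested_key], ls=0.3)

rng = np.random.default_rng(0)
configs = [sample_config(rng) for _ in range(5)]
K = np.array([[branching_kernel(a, b) for b in configs] for a in configs])
print(np.round(K, 3))   # Gram matrix that respects the branching structure
```

Under this toy kernel, configurations on different branches still share information through the learning-rate component, while nested parameters contribute only when the branch values match, which is the qualitative behavior the paper's kernel is designed to capture.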
Related papers
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of the optimizers, parameterizations, learning rates, and model sizes studied.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z) - Parameter Optimization with Conscious Allocation (POCA) [4.478575931884855]
Hyperband-based approaches are among the most effective methods for hyperparameter optimization in machine learning.
We present Parameter Optimization with Conscious Allocation (POCA), a new hyperband-based algorithm that adaptively allocates the input budget to the hyperparameter configurations it generates.
POCA finds strong configurations faster in both settings.
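POCA itself is not specified in enough detail in this summary to reproduce; as a point of reference, the sketch below shows only the generic successive-halving budget allocation that Hyperband-style methods build on, with a made-up configuration sampler and a synthetic loss. POCA's adaptive splitting of the input budget is not shown.

```python
import numpy as np

def successive_halving(sample_config, evaluate, n_configs=27, min_budget=1, eta=3):
    """Generic successive halving: give every configuration a small budget,
    keep the best 1/eta fraction, and repeat with eta times the budget.
    Hyperband-style methods such as POCA build on this allocation scheme."""
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        scores = [evaluate(c, budget) for c in configs]
        order = np.argsort(scores)                 # lower score = better
        keep = max(1, len(configs) // eta)
        configs = [configs[i] for i in order[:keep]]
        budget *= eta
    return configs[0]

# Toy usage: configurations are learning rates, and the "validation loss" is a
# synthetic function whose noise shrinks as the budget grows.
rng = np.random.default_rng(0)
best = successive_halving(
    sample_config=lambda: 10 ** rng.uniform(-5, -1),
    evaluate=lambda lr, b: (np.log10(lr) + 3) ** 2 + rng.normal(0, 1.0 / b),
)
print(f"selected learning rate: {best:.2e}")
```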
arXiv Detail & Related papers (2023-12-29T00:13:55Z) - Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
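As a rough illustration of allocating trainable parameters to important positions, the sketch below ranks weights by a common gradient-based sensitivity score and unfreezes only a small top fraction. The score, the 5% budget, and the masking scheme are assumptions made here for illustration; SPT's actual criterion and allocation are described in the paper.

```python
import torch
import torch.nn as nn

# Tiny stand-in model and task batch (assumed for illustration only).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
x, y = torch.randn(64, 16), torch.randint(0, 2, (64,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

# One common sensitivity proxy: |gradient * parameter| on a task batch.
sensitivity = torch.cat([(p.grad * p).abs().flatten() for p in model.parameters()])
k = int(0.05 * sensitivity.numel())                # tune only ~5% of the weights
threshold = torch.topk(sensitivity, k).values.min()

# Boolean masks marking the trainable positions; during fine-tuning, gradients
# would be multiplied by these masks so that frozen positions never change.
masks = [((p.grad * p).abs() >= threshold).float() for p in model.parameters()]
print(sum(int(m.sum()) for m in masks), "of", sensitivity.numel(), "weights trainable")
```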
arXiv Detail & Related papers (2023-03-15T12:34:24Z) - On the Effectiveness of Parameter-Efficient Fine-Tuning [79.6302606855302]
Currently, many research works propose to only fine-tune a small portion of the parameters while keeping most of the parameters shared across different tasks.
We show that all of the methods are actually sparse fine-tuned models and conduct a novel theoretical analysis of them.
Despite the effectiveness of sparsity grounded in our theory, how to choose the tunable parameters remains an open problem.
arXiv Detail & Related papers (2022-11-28T17:41:48Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian
process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed ensemble-GP Thompson sampling (EGP-TS) to the global optimum, an analysis is conducted based on the notion of Bayesian regret.
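The random feature-based kernel approximation mentioned above is easy to sketch in its standard Rahimi-Recht form for an RBF kernel; the ensemble-of-GPs and Thompson-sampling components of EGP-TS are not reproduced here.

```python
import numpy as np

def random_fourier_features(X, n_features=2000, lengthscale=1.0, seed=0):
    """Random Fourier features approximating an RBF kernel:
    k(x, x') ~= phi(x) @ phi(x').  Working in this finite feature space lets
    posterior function samples be drawn at a cost linear in the number of
    observations, which is the scalability the summary refers to."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0 / lengthscale, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).normal(size=(5, 3))
Phi = random_fourier_features(X)
K_exact = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
print(np.max(np.abs(Phi @ Phi.T - K_exact)))   # error shrinks as n_features grows
```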
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z) - Efficient Hyperparameter Tuning with Dynamic Accuracy Derivative-Free
Optimization [0.27074235008521236]
We apply a recent dynamic accuracy derivative-free optimization method to hyperparameter tuning.
This method allows inexact evaluations of the learning problem while retaining convergence guarantees.
We demonstrate its robustness and efficiency compared to a fixed accuracy approach.
arXiv Detail & Related papers (2020-11-06T00:59:51Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z) - Rethinking the Hyperparameters for Fine-tuning [78.15505286781293]
Fine-tuning from pre-trained ImageNet models has become the de-facto standard for various computer vision tasks.
Current practices for fine-tuning typically involve selecting an ad-hoc choice of hyperparameters.
This paper re-examines several common practices of setting hyperparameters for fine-tuning.
arXiv Detail & Related papers (2020-02-19T18:59:52Z) - Online Parameter Estimation for Safety-Critical Systems with Gaussian
Processes [6.122161391301866]
We present a Bayesian optimization framework based on Gaussian processes (GPs) for online parameter estimation.
It uses an efficient search strategy over a response surface in the parameter space for finding the global optima with minimal function evaluations.
We demonstrate our technique on an actuated planar pendulum and safety-critical quadrotor in simulation with changing parameters.
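For orientation, a generic Gaussian-process-plus-expected-improvement loop of the kind such frameworks build on is sketched below; the objective, kernel, and grid-based acquisition maximization are assumptions made for illustration and are not the cited paper's online estimation strategy or its safety-critical setup.

```python
import numpy as np
from scipy.stats import norm

def objective(theta):                          # hypothetical measurement mismatch
    return (np.sin(3 * theta) + 0.5 * theta - 0.2) ** 2

def gp_posterior(X, y, Xs, ls=0.3, noise=1e-6):
    """Posterior mean and std of a zero-mean GP with an RBF kernel."""
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None]) ** 2 / ls ** 2)
    L = np.linalg.cholesky(k(X, X) + noise * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = k(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)   # k(x, x) = 1
    return mu, np.sqrt(var)

rng = np.random.default_rng(0)
grid = np.linspace(-2.0, 2.0, 400)
X = rng.uniform(-2.0, 2.0, size=3)             # initial design
y = objective(X)
for _ in range(10):                            # sequential evaluations
    mu, sd = gp_posterior(X, y, grid)
    imp = y.min() - mu                         # expected improvement (minimization)
    ei = imp * norm.cdf(imp / sd) + sd * norm.pdf(imp / sd)
    x_next = grid[np.argmax(ei)]
    X, y = np.append(X, x_next), np.append(y, objective(x_next))
print(f"estimated parameter: {X[np.argmin(y)]:.3f}")
```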
arXiv Detail & Related papers (2020-02-18T20:38:00Z)