Hyper-parameter estimation method with particle swarm optimization
- URL: http://arxiv.org/abs/2011.11944v2
- Date: Mon, 14 Dec 2020 04:16:34 GMT
- Title: Hyper-parameter estimation method with particle swarm optimization
- Authors: Yaru Li, Yulai Zhang
- Abstract summary: The PSO method cannot be directly used in the problem of hyper-parameter estimation.
The proposed method uses the particle swarm method to optimize the acquisition function.
The results on several benchmark problems are improved.
- Score: 0.8883733362171032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The particle swarm optimization (PSO) method cannot be used directly for hyper-parameter estimation, since the mathematical form of the mapping from hyper-parameters to the loss function or generalization accuracy is unclear. The Bayesian optimization (BO) framework converts the optimization of the hyper-parameters into the optimization of an acquisition function. Because the acquisition function is non-convex and multi-peaked, the problem is well suited to PSO. The method proposed in this paper uses the particle swarm method to optimize the acquisition function within the BO framework in order to obtain better hyper-parameters. The performance of the proposed method is evaluated and demonstrated on both classification and regression models, and the results on several benchmark problems are improved.
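To make the procedure concrete, the sketch below shows PSO used as the inner optimizer of a BO acquisition function. It is a minimal illustration under assumptions, not the authors' implementation: the Gaussian-process surrogate and Matern kernel come from scikit-learn, the acquisition is expected improvement, and the PSO constants (inertia w, c1, c2), function names, and toy objective are all illustrative.

```python
# Minimal sketch, NOT the authors' code: a GP surrogate models the map from
# hyper-parameters to validation loss, and PSO maximizes the (non-convex,
# multi-peaked) acquisition function inside the BO loop. The EI acquisition,
# PSO constants (w, c1, c2) and all names are illustrative assumptions.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(x, gp, y_best):
    """EI acquisition for minimization (larger is better)."""
    mu, sigma = gp.predict(np.atleast_2d(x), return_std=True)
    sigma = np.maximum(sigma, 1e-12)
    z = (y_best - mu) / sigma
    return float(((y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z))[0])

def pso_maximize(f, bounds, n_particles=30, n_iters=50, w=0.7, c1=1.5, c2=1.5):
    """Maximize f over a box with a plain global-best PSO."""
    lo, hi = bounds[:, 0], bounds[:, 1]
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
    gbest = pbest[np.argmax(pbest_val)].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([f(p) for p in x])
        better = vals > pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        gbest = pbest[np.argmax(pbest_val)].copy()
    return gbest

def bo_with_pso(objective, bounds, n_init=5, n_rounds=20):
    """BO loop: fit the GP on evaluated points, pick the next point by PSO on EI."""
    rng = np.random.default_rng(1)
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_init, bounds.shape[0]))
    y = np.array([objective(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_rounds):
        gp.fit(X, y)
        x_next = pso_maximize(lambda x: expected_improvement(x, gp, y.min()), bounds)
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()

if __name__ == "__main__":
    # Toy stand-in for "train a model and return its validation loss".
    loss = lambda h: np.sin(3 * h[0]) + (h[0] - 0.7) ** 2 + 0.3 * (h[1] + 0.2) ** 2
    best_h, best_loss = bo_with_pso(loss, bounds=np.array([[-2.0, 2.0], [-2.0, 2.0]]))
    print("best hyper-parameters:", best_h, "validation loss:", best_loss)
```

In an actual hyper-parameter search, `objective` would train the model with the candidate hyper-parameters and return a validation loss or error; the surrounding loop is unchanged.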
Related papers
- How to Prove the Optimized Values of Hyperparameters for Particle Swarm Optimization? [0.0]
This study proposes an analytic framework to analyze the optimized average-fitness-function-value (AFFV) based on mathematical models for a variety of fitness functions.
Experimental results show that the hyper-parameter values from the proposed method achieve higher convergence efficiency and lower AFFVs.
arXiv Detail & Related papers (2023-02-01T00:33:35Z)
- Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
- A Globally Convergent Gradient-based Bilevel Hyperparameter Optimization Method [0.0]
We propose a gradient-based bilevel method for solving the hyperparameter optimization problem.
We show that the proposed method converges with lower computation and leads to models that generalize better on the testing set.
arXiv Detail & Related papers (2022-08-25T14:25:16Z)
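The exact algorithm of the bilevel paper above is not reproduced here; the sketch below only illustrates the generic gradient-based bilevel idea on a ridge-regression inner problem, where the implicit function theorem gives the hyper-gradient in closed form. The inner problem, the descent on log(lambda), and all names are assumptions made for illustration.

```python
# Minimal sketch (not the paper's algorithm) of gradient-based bilevel HPO:
# inner problem = ridge regression with closed-form solution, outer problem =
# validation loss, hyper-gradient obtained via the implicit function theorem.
import numpy as np

def inner_solution(X_tr, y_tr, lam):
    """w*(lam) = argmin_w 0.5||X_tr w - y_tr||^2 + 0.5*lam*||w||^2."""
    d = X_tr.shape[1]
    A = X_tr.T @ X_tr + lam * np.eye(d)
    return np.linalg.solve(A, X_tr.T @ y_tr), A

def hypergradient(X_tr, y_tr, X_val, y_val, lam):
    """dF/dlam for F(lam) = 0.5||X_val w*(lam) - y_val||^2."""
    w, A = inner_solution(X_tr, y_tr, lam)
    grad_w_F = X_val.T @ (X_val @ w - y_val)   # outer gradient w.r.t. w
    dw_dlam = -np.linalg.solve(A, w)           # implicit function theorem: dw*/dlam = -A^{-1} w*
    return float(grad_w_F @ dw_dlam)

def bilevel_hpo(X_tr, y_tr, X_val, y_val, log_lam=0.0, lr=0.1, n_steps=100):
    """Gradient descent on log(lam) so the hyperparameter stays positive."""
    for _ in range(n_steps):
        lam = np.exp(log_lam)
        g = hypergradient(X_tr, y_tr, X_val, y_val, lam)
        log_lam -= lr * g * lam                # chain rule: dF/dlog(lam) = dF/dlam * lam
    return np.exp(log_lam)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    y = X @ rng.normal(size=20) + 0.5 * rng.normal(size=200)
    print("selected regularization:", bilevel_hpo(X[:100], y[:100], X[100:], y[100:]))
```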
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- Hyper-parameter optimization based on soft actor critic and hierarchical mixture regularization [5.063728016437489]
We model the hyper-parameter optimization process as a Markov decision process and tackle it with reinforcement learning.
A novel hyper-parameter optimization method based on soft actor-critic and hierarchical mixture regularization is proposed.
arXiv Detail & Related papers (2021-12-08T02:34:43Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
- Implicit differentiation for fast hyperparameter selection in non-smooth convex learning [87.60600646105696]
We study first-order methods when the inner optimization problem is convex but non-smooth.
We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yield sequences of Jacobians converging toward the exact Jacobian.
arXiv Detail & Related papers (2021-05-04T17:31:28Z)
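For the implicit-differentiation entry above, the sketch below shows forward-mode (iterative) differentiation of proximal gradient descent (ISTA) for the Lasso: the iterate and its derivative with respect to the regularization parameter are propagated together, and the derivative sequence converges toward the exact Jacobian. This is a hedged illustration under assumptions, not the paper's code; the toy data and variable names are invented.

```python
# Minimal sketch of forward-mode differentiation of ISTA for the Lasso:
# beta and J = d beta / d lam are updated jointly at every proximal step.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_with_jacobian(X, y, lam, n_iters=500):
    """Run ISTA on 0.5||X b - y||^2 + lam*||b||_1, carrying J = d b / d lam forward."""
    d = X.shape[1]
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the smooth part
    beta, J = np.zeros(d), np.zeros(d)
    for _ in range(n_iters):
        z = beta - X.T @ (X @ beta - y) / L
        dz = J - X.T @ (X @ J) / L         # derivative of the gradient step w.r.t. lam
        active = np.abs(z) > lam / L       # support kept by soft-thresholding
        beta = soft_threshold(z, lam / L)
        J = np.where(active, dz - np.sign(z) / L, 0.0)
    return beta, J

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 30))
    y = X[:, :5] @ np.ones(5) + 0.1 * rng.normal(size=100)
    beta, J = ista_with_jacobian(X, y, lam=5.0)
    # J can feed a hyper-gradient of a validation criterion for lam selection.
    print("nonzeros:", int(np.sum(beta != 0)), " ||dbeta/dlam||:", np.linalg.norm(J))
```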
- Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem.
Then, we use the average zeroth-order hyper-gradients to update the hyper-parameters.
arXiv Detail & Related papers (2021-02-17T21:03:05Z)
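The HOZOG entry above averages zeroth-order hyper-gradients; the sketch below illustrates that estimator in the simplest setting, where "training" is a closed-form ridge solve and the hyper-gradient is estimated from finite differences along random Gaussian directions. It is not the HOZOG algorithm itself; all names and constants are assumptions.

```python
# Minimal sketch of an averaged zeroth-order hyper-gradient: the validation
# loss is a black box of the hyper-parameter, and its gradient is estimated
# from finite differences along random directions, then averaged.
import numpy as np

def validation_loss(lam_vec, X_tr, y_tr, X_val, y_val):
    """Black box: train with hyper-parameter exp(lam_vec[0]), report validation loss."""
    lam = np.exp(lam_vec[0])
    d = X_tr.shape[1]
    w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)
    r = X_val @ w - y_val
    return 0.5 * float(r @ r)

def zeroth_order_hypergrad(f, lam_vec, n_dirs=10, mu=1e-2, rng=None):
    """Average of (f(lam + mu*u) - f(lam)) / mu * u over random directions u."""
    if rng is None:
        rng = np.random.default_rng(0)
    f0, g = f(lam_vec), np.zeros_like(lam_vec)
    for _ in range(n_dirs):
        u = rng.normal(size=lam_vec.shape)
        g += (f(lam_vec + mu * u) - f0) / mu * u
    return g / n_dirs

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 20))
    y = X @ rng.normal(size=20) + 0.5 * rng.normal(size=200)
    f = lambda v: validation_loss(v, X[:100], y[:100], X[100:], y[100:])
    lam_vec = np.zeros(1)
    for _ in range(50):                     # hyper-parameter updates with the averaged estimate
        lam_vec -= 0.1 * zeroth_order_hypergrad(f, lam_vec, rng=rng)
    print("selected lambda:", float(np.exp(lam_vec[0])))
```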
- Efficient hyperparameter optimization by way of PAC-Bayes bound minimization [4.191847852775072]
We present an alternative objective that is equivalent to a Probably Approximately Correct-Bayes (PAC-Bayes) bound on the expected out-of-sample error.
We then devise an efficient gradient-based algorithm to minimize this objective.
arXiv Detail & Related papers (2020-08-14T15:54:51Z)
- Online Hyperparameter Search Interleaved with Proximal Parameter Updates [9.543667840503739]
We develop a method that relies on the structure of proximal gradient methods and does not require a smooth cost function.
Such a method is applied to Leave-one-out (LOO)-validated Lasso and Group Lasso.
Numerical experiments corroborate the convergence of the proposed method to a local optimum of the LOO validation error curve.
arXiv Detail & Related papers (2020-04-06T15:54:03Z)
- Implicit differentiation of Lasso-type models for hyperparameter optimization [82.73138686390514]
We introduce an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems.
Our approach scales to high-dimensional data by leveraging the sparsity of the solutions.
arXiv Detail & Related papers (2020-02-20T18:43:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.