Scalable Hyperparameter Optimization with Lazy Gaussian Processes
- URL: http://arxiv.org/abs/2001.05726v1
- Date: Thu, 16 Jan 2020 10:15:55 GMT
- Title: Scalable Hyperparameter Optimization with Lazy Gaussian Processes
- Authors: Raju Ram, Sabine Müller, Franz-Josef Pfreundt, Nicolas R. Gauger,
Janis Keuper
- Abstract summary: We present a novel, highly accurate approximation of the underlying Gaussian Process.
First experiments show a speedup by a factor of 162 on a single node and a further speedup by a factor of 5 in a parallel environment.
- Score: 1.3999481573773074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most machine learning methods require careful selection of hyper-parameters
in order to train a high performing model with good generalization abilities.
Hence, several automatic selection algorithms have been introduced to overcome
tedious manual (trial and error) tuning of these parameters. Due to its very high
sample efficiency, Bayesian Optimization over a Gaussian Processes modeling of
the parameter space has become the method of choice. Unfortunately, this
approach suffers from a cubic compute complexity due to underlying Cholesky
factorization, which makes it very hard to be scaled beyond a small number of
sampling steps. In this paper, we present a novel, highly accurate
approximation of the underlying Gaussian Process. Reducing its computational
complexity from cubic to quadratic allows an efficient strong scaling of
Bayesian Optimization while outperforming the previous approach regarding
optimization accuracy. First experiments show a speedup by a factor of 162 on
a single node and a further speedup by a factor of 5 in a parallel environment.
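To make the complexity claim concrete, the following is a minimal, generic sketch of exact GP regression via a Cholesky factorization, marking where the cubic cost arises. It is not the paper's lazy approximation; the RBF kernel, jitter value, and toy data are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel between the rows of A and B.
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale ** 2)

def gp_posterior(X, y, X_star, noise=1e-3):
    """Exact GP regression posterior mean and variance at the query points X_star."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = cholesky(K, lower=True)                    # O(n^3): the per-step bottleneck
    alpha = solve_triangular(L.T, solve_triangular(L, y, lower=True))  # O(n^2) solves
    K_s = rbf_kernel(X, X_star)
    mean = K_s.T @ alpha
    V = solve_triangular(L, K_s, lower=True)
    var = rbf_kernel(X_star, X_star).diagonal() - (V * V).sum(axis=0)
    return mean, var

# In Bayesian optimization the model is refit after every new sample, so with an
# exact Cholesky the cumulative cost over t samples grows roughly like O(t^4).
X = np.random.rand(50, 3)
y = np.sin(X).sum(axis=1)
mu, var = gp_posterior(X, y, np.random.rand(5, 3))
```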
Related papers
- Enhancing Gaussian Process Surrogates for Optimization and Posterior Approximation via Random Exploration [2.984929040246293]
The paper proposes novel noise-free Bayesian optimization strategies that rely on a random exploration step to enhance the accuracy of Gaussian process surrogate models.
The new algorithms retain the ease of implementation of the classical GP-UCB, but the additional exploration step facilitates their convergence.
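As a point of reference for the idea above, here is a hedged sketch of a GP-UCB loop in which an occasional query is drawn uniformly at random; the toy objective, the UCB coefficient, and the exploration probability are assumptions, not the authors' exact algorithm.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):                                     # toy 1-D objective (assumption)
    return -(x - 0.3) ** 2

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 201).reshape(-1, 1)      # candidate set
X = rng.uniform(0, 1, size=(3, 1))                    # initial design
y = objective(X).ravel()

for t in range(20):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-6).fit(X, y)
    if rng.random() < 0.2:                            # random exploration step
        x_next = rng.uniform(0, 1, size=(1, 1))
    else:                                             # classical GP-UCB step
        mu, sigma = gp.predict(grid, return_std=True)
        ucb = mu + 2.0 * sigma
        x_next = grid[[np.argmax(ucb)]]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next).ravel())

print("best x:", X[np.argmax(y)], "best value:", y.max())
```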
arXiv Detail & Related papers (2024-01-30T14:16:06Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithm that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
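The paper's extragradient scheme is not reproduced here; as a familiar baseline for the same entropy-regularized objective, the sketch below runs the standard Sinkhorn iteration (cost matrix and regularization strength are illustrative assumptions).

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.05, n_iter=500):
    """Approximate OT plan between histograms a and b for cost matrix C."""
    K = np.exp(-C / reg)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # alternating scaling updates
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]      # transport plan
    return P, (P * C).sum()              # plan and its transport cost

# Toy example: two uniform histograms over points on a line.
x = np.linspace(0, 1, 50)
C = (x[:, None] - x[None, :]) ** 2
a = np.full(50, 1 / 50)
b = np.full(50, 1 / 50)
P, cost = sinkhorn(a, b, C)
```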
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
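As a hedged illustration of where preconditioning enters, the sketch below solves the kernel system (K + sigma^2 I) x = y with preconditioned conjugate gradients, using a simple Nystrom-style low-rank preconditioner applied through the Woodbury identity; the landmark selection and sizes are assumptions, not the paper's construction.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

rng = np.random.default_rng(0)
n, m, sigma2 = 2000, 100, 1e-2
X = rng.uniform(0, 1, size=(n, 2))

def rbf(A, B, ls=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

K = rbf(X, X) + sigma2 * np.eye(n)
y = rng.standard_normal(n)

# Nystrom factor U from m randomly chosen landmark columns (an assumption; the
# paper analyses more careful pivoted-Cholesky-type preconditioners).
idx = rng.choice(n, m, replace=False)
Kmm = rbf(X[idx], X[idx]) + 1e-8 * np.eye(m)
U = rbf(X, X[idx]) @ np.linalg.inv(np.linalg.cholesky(Kmm)).T   # K approx U U^T

M_inner = np.linalg.inv(sigma2 * np.eye(m) + U.T @ U)
def apply_precond(v):
    # Woodbury identity: (U U^T + sigma^2 I)^{-1} v
    return (v - U @ (M_inner @ (U.T @ v))) / sigma2

M = LinearOperator((n, n), matvec=apply_precond)
x, info = cg(K, y, M=M, maxiter=500)
```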
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
- Implicit differentiation for fast hyperparameter selection in non-smooth convex learning [87.60600646105696]
We study first-order methods when the inner optimization problem is convex but non-smooth.
We show that forward-mode differentiation of proximal gradient descent and proximal coordinate descent yields sequences of Jacobians converging toward the exact Jacobian.
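A minimal sketch of this forward-mode idea for the Lasso: propagate the Jacobian of the iterates with respect to the regularization parameter alongside proximal gradient descent (ISTA). Step size, data, and the chosen lambda are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_with_hypergradient(X, y, lam, n_iter=500):
    n, p = X.shape
    gamma = 1.0 / np.linalg.norm(X, 2) ** 2       # step size from the Lipschitz constant
    beta = np.zeros(p)
    jac = np.zeros(p)                             # d(beta)/d(lambda), forward mode
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        z = beta - gamma * grad
        dz = jac - gamma * (X.T @ (X @ jac))      # chain rule through the gradient step
        active = np.abs(z) > gamma * lam          # where the soft-threshold is non-zero
        beta = soft_threshold(z, gamma * lam)
        jac = active * (dz - gamma * np.sign(z))  # d/dz and d/d(threshold) contributions
    return beta, jac

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = X @ (rng.standard_normal(20) * (rng.random(20) < 0.3)) + 0.1 * rng.standard_normal(100)
beta, dbeta_dlam = ista_with_hypergradient(X, y, lam=0.5)
# dbeta_dlam can be chained with an outer validation loss to obtain a hypergradient.
```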
arXiv Detail & Related papers (2021-05-04T17:31:28Z)
- Hyper-optimization with Gaussian Process and Differential Evolution Algorithm [0.0]
This paper presents specific modifications of Gaussian Process optimization components from available scientific libraries.
The presented modifications were submitted to the BlackBox 2020 challenge, where they outperformed some conventionally available optimization libraries.
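One plausible way to combine the two components, sketched here under stated assumptions (toy objective, Matern kernel, lower-confidence-bound acquisition), is to let SciPy's differential evolution optimize the acquisition function of a Gaussian Process surrogate; this is not necessarily the authors' exact pipeline.

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def black_box(x):                           # expensive function to be tuned (toy)
    return np.sin(3 * x[0]) + 0.5 * (x[1] - 0.2) ** 2

bounds = [(0.0, 1.0), (0.0, 1.0)]
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 2))
y = np.array([black_box(x) for x in X])

for _ in range(15):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True).fit(X, y)

    def lcb(x):                             # lower confidence bound, minimized by DE
        mu, sigma = gp.predict(x.reshape(1, -1), return_std=True)
        return float(mu[0] - 2.0 * sigma[0])

    result = differential_evolution(lcb, bounds, seed=0, maxiter=50, tol=1e-6)
    X = np.vstack([X, result.x])
    y = np.append(y, black_box(result.x))

print("best point:", X[np.argmin(y)], "best value:", y.min())
```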
arXiv Detail & Related papers (2021-01-26T08:33:00Z)
- Efficient hyperparameter optimization by way of PAC-Bayes bound minimization [4.191847852775072]
We present an alternative objective that is equivalent to a Probably Approximately Correct-Bayes (PAC-Bayes) bound on the expected out-of-sample error.
We then devise an efficient gradient-based algorithm to minimize this objective.
arXiv Detail & Related papers (2020-08-14T15:54:51Z)
- Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that an attenuating step-size is required for exact convergence against the fact that a constant step-size learns faster, albeit only up to an error.
Rather than fixing the minibatch size and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
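An illustrative sketch of the general idea, not the paper's exact rule: run SGD with a constant step-size and grow the minibatch whenever the gradient noise dominates the gradient signal. The problem, step size, and growth test below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 20))
b = A @ rng.standard_normal(20) + 0.1 * rng.standard_normal(5000)

w = np.zeros(20)
step, batch = 0.1, 16
for it in range(300):
    idx = rng.choice(len(A), batch, replace=False)
    residual = A[idx] @ w - b[idx]
    grads = A[idx] * residual[:, None]            # per-sample gradients
    g = grads.mean(axis=0)
    # Norm-test style check (an assumption): grow the batch when the estimated
    # variance of the mean gradient exceeds the squared gradient norm.
    var = grads.var(axis=0).sum() / batch
    if var > np.dot(g, g) and batch < len(A):
        batch = min(2 * batch, len(A))
    w -= step * g

print("final batch size:", batch, "loss:", 0.5 * np.mean((A @ w - b) ** 2))
```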
arXiv Detail & Related papers (2020-07-02T16:02:02Z)
- Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with Gaussian processes trained on few data points.
The approach also leads to a significantly smaller and computationally cheaper sub-solver for lower bounding.
In total, the proposed method reduces the time to convergence by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)
- Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
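A simplified sketch of the setting, not the paper's estimator: several workers solve a ridge regression on independently sketched data and the driver averages the local solutions; the sketch size, worker count, and plain (non-debiased) average are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, q, m, lam = 10000, 50, 8, 500, 1.0
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

def local_solution(seed):
    # Each worker applies an independent Gaussian sketch S (m x n) and solves
    # the sketched regularized normal equations.
    S = np.random.default_rng(seed).standard_normal((m, n)) / np.sqrt(m)
    SA, Sb = S @ A, S @ b
    return np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

x_avg = np.mean([local_solution(s) for s in range(q)], axis=0)
x_exact = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)
print("relative error of the averaged estimate:",
      np.linalg.norm(x_avg - x_exact) / np.linalg.norm(x_exact))
```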
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
- Accelerating Quantum Approximate Optimization Algorithm using Machine Learning [6.735657356113614]
We propose a machine learning based approach to accelerate quantum approximate optimization algorithm (QAOA) implementation.
QAOA is a quantum-classical hybrid algorithm intended to demonstrate so-called quantum supremacy.
We show that the proposed approach can curtail the number of optimization iterations by up to 65.7%, based on an analysis performed with 264 flavors of graphs.
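A very rough sketch of the warm-start idea, with the QAOA circuit itself omitted: a regressor maps simple graph features to initial angles that seed the classical optimizer. The features, the placeholder training pairs, and the model choice are assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def graph_features(adj):
    # Toy features of a graph given its adjacency matrix (an assumption).
    degrees = adj.sum(axis=1)
    return np.array([adj.sum() / 2, degrees.mean(), degrees.std(), len(adj)])

# Placeholder training set: in practice these would be (graph, tuned-angle) pairs
# collected from previously optimized QAOA instances.
rng = np.random.default_rng(0)
train_graphs = [(rng.random((12, 12)) < 0.3).astype(float) for _ in range(50)]
train_graphs = [np.triu(g, 1) + np.triu(g, 1).T for g in train_graphs]
train_angles = rng.uniform(0, np.pi, size=(50, 2))     # placeholder (gamma, beta)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(np.stack([graph_features(g) for g in train_graphs]), train_angles)

new_graph = np.triu((rng.random((12, 12)) < 0.3).astype(float), 1)
new_graph = new_graph + new_graph.T
init_angles = model.predict(graph_features(new_graph).reshape(1, -1))[0]
# init_angles would seed the classical optimizer inside the QAOA loop, which is
# where the reported reduction in iterations comes from.
```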
arXiv Detail & Related papers (2020-02-04T02:21:00Z)