Gaussian Process Bandit Optimization of the Thermodynamic Variational
Objective
- URL: http://arxiv.org/abs/2010.15750v3
- Date: Fri, 20 Nov 2020 22:32:04 GMT
- Title: Gaussian Process Bandit Optimization of the Thermodynamic Variational
Objective
- Authors: Vu Nguyen, Vaden Masrani, Rob Brekelmans, Michael A. Osborne, Frank
Wood
- Abstract summary: This paper introduces a bespoke Gaussian process bandit optimization method for automatically choosing sorted discretization points.
We provide theoretical guarantees that our bandit optimization converges to the regret-minimizing choice of integration points.
Empirical validation of our algorithm is provided in terms of improved learning and inference in Variational Autoencoders and Sigmoid Belief Networks.
- Score: 36.062939523856066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Achieving the full promise of the Thermodynamic Variational Objective (TVO),
a recently proposed variational lower bound on the log evidence involving a
one-dimensional Riemann integral approximation, requires choosing a "schedule"
of sorted discretization points. This paper introduces a bespoke Gaussian
process bandit optimization method for automatically choosing these points. Our
approach not only automates their one-time selection, but also dynamically
adapts their positions over the course of optimization, leading to improved
model learning and inference. We provide theoretical guarantees that our bandit
optimization converges to the regret-minimizing choice of integration points.
Empirical validation of our algorithm is provided in terms of improved learning
and inference in Variational Autoencoders and Sigmoid Belief Networks.
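For context, the TVO is the left-Riemann discretization of the thermodynamic integration identity; because the integrand $\mathbb{E}_{\pi_\beta}[\log w]$ is nondecreasing in $\beta$, the sum lower-bounds the log evidence:

```latex
\log p(x) \;=\; \int_0^1 \mathbb{E}_{\pi_\beta}[\log w]\, d\beta
\;\ge\; \sum_{k=1}^{K} (\beta_k - \beta_{k-1})\,
\mathbb{E}_{\pi_{\beta_{k-1}}}[\log w] \;=\; \mathrm{TVO},
\qquad \pi_\beta \propto q(z \mid x)^{1-\beta}\, p(x, z)^{\beta},
```

with $w = p(x,z)/q(z \mid x)$ and schedule $0 = \beta_0 < \beta_1 < \dots < \beta_K = 1$. Below is a minimal sketch of the bandit loop, assuming a one-parameter log-spaced schedule and a synthetic stand-in for the TVO reward; the paper's method is more elaborate (it adapts the full set of points during training), so the kernel, acquisition constant, and reward here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: GP-UCB over a single schedule knob beta_1, with the
# remaining points log-spaced up to 1. `tvo_reward` is a synthetic
# stand-in (assumption); in practice it would be the TVO value obtained
# after training with the candidate schedule.
import numpy as np

rng = np.random.default_rng(0)

def schedule(beta1, num_points=10):
    """Sorted discretization points: 0, then points log-spaced from beta1 to 1."""
    return np.concatenate([[0.0], np.geomspace(beta1, 1.0, num_points - 1)])

def tvo_reward(beta1):
    """Synthetic noisy reward peaking near beta1 = 1e-2 (purely illustrative)."""
    return -(np.log10(beta1) + 2.0) ** 2 + 0.05 * rng.normal()

def rbf(a, b, ls=0.5):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

grid = np.linspace(-4.0, -0.3, 200)        # search over x = log10(beta_1)
X, y = [], []
for t in range(15):
    if len(X) < 2:
        x_next = rng.uniform(-4.0, -0.3)   # initial random queries
    else:
        Xa, ya = np.array(X), np.array(y)
        Kxx = rbf(Xa, Xa) + 1e-4 * np.eye(len(Xa))   # GP posterior
        Ksx = rbf(grid, Xa)
        mu = Ksx @ np.linalg.solve(Kxx, ya)
        var = 1.0 - np.sum(Ksx * np.linalg.solve(Kxx, Ksx.T).T, axis=1)
        # UCB acquisition: mean plus scaled posterior standard deviation
        x_next = grid[np.argmax(mu + 2.0 * np.sqrt(np.clip(var, 0.0, None)))]
    X.append(x_next)
    y.append(tvo_reward(10.0 ** x_next))

best = 10.0 ** X[int(np.argmax(y))]
print("chosen beta_1:", best)
print("schedule:", schedule(best))
```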
Related papers
- Numerical Optimization for Tensor Disentanglement [7.88541926763416]
This paper focuses on tensor disentangling, the task of identifying transformations that reduce bond dimensions by exploiting gauge freedom in the network.
We formulate this problem as an optimization problem over orthogonal matrices acting on a single tensor's indices, aiming to minimize the rank of its matricized form.
To seek the often unknown optimal rank, we introduce a binary search strategy integrated with the disentangling procedure.
arXiv Detail & Related papers (2025-08-26T20:17:48Z) - Sample-efficient Bayesian Optimisation Using Known Invariances [56.34916328814857]
We show that vanilla and constrained BO algorithms are inefficient when optimising invariant objectives.
We derive a bound on the maximum information gain of these invariant kernels.
We use our method to design a current drive system for a nuclear fusion reactor, finding a high-performance solution.
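A minimal sketch of the group-averaging construction behind such invariant kernels; the sign-flip group below is an assumed example, whereas the paper treats general known invariances and derives information-gain bounds.

```python
# Sketch: a kernel made invariant to a known finite symmetry group by
# averaging the base kernel over the group's action on one argument.
import numpy as np

def rbf(x, y, ls=1.0):
    return float(np.exp(-0.5 * np.sum((x - y) ** 2) / ls**2))

group = [lambda x: x, lambda x: -x]        # assume f(x) = f(-x)

def k_inv(x, y):
    # Averaging over the (orthogonal) group makes k_inv(g(x), y) = k_inv(x, y).
    return np.mean([rbf(g(x), y) for g in group])

x, y = np.array([0.3, -1.0]), np.array([0.2, 0.9])
print(k_inv(x, y), k_inv(-x, y))           # identical by construction
```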
arXiv Detail & Related papers (2024-10-22T12:51:46Z) - Enhancing Gaussian Process Surrogates for Optimization and Posterior Approximation via Random Exploration [2.984929040246293]
We propose novel noise-free Bayesian optimization strategies that rely on a random exploration step to enhance the accuracy of Gaussian process surrogate models.
The new algorithms retain the ease of implementation of classical GP-UCB, while the additional exploration step facilitates their convergence.
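A hedged sketch of that exploration step; the schedule and acquisition details below are assumptions, not the paper's exact rule.

```python
# Sketch: augmenting a GP-UCB loop with an occasional uniform random query.
import numpy as np

rng = np.random.default_rng(1)

def next_query(t, grid, ucb_values, explore_every=5):
    if t % explore_every == 0:            # random exploration step
        return rng.choice(grid)
    return grid[np.argmax(ucb_values)]    # standard GP-UCB choice

grid = np.linspace(0.0, 1.0, 100)
fake_ucb = rng.normal(size=100)           # stand-in for computed UCB values
print(next_query(5, grid, fake_ucb), next_query(3, grid, fake_ucb))
```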
arXiv Detail & Related papers (2024-01-30T14:16:06Z) - Analyzing and Enhancing the Backward-Pass Convergence of Unrolled
Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
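A small numpy sketch of the equivalence the paper analyzes: differentiating through an unrolled solver versus solving the corresponding linear system at the fixed point. The quadratic test problem is an assumption for illustration.

```python
# Unrolled solver: gradient descent on f(x) = 0.5 x'Ax - b(theta)'x,
# whose fixed point is x*(theta) = A^{-1} b(theta).
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
lr = 0.2

def b(theta):
    return np.array([theta, theta**2])

def solve_unrolled(theta, steps=200):
    x = np.zeros(2)
    for _ in range(steps):                 # unrolled iterations x <- T(x)
        x = x - lr * (A @ x - b(theta))
    return x

theta = 1.5
# Backward pass as a linear system: dx*/dtheta = (I - dT/dx)^{-1} dT/dtheta,
# with dT/dx = I - lr*A and dT/dtheta = lr * db/dtheta (lr factors cancel).
db = np.array([1.0, 2 * theta])
dx_dtheta = np.linalg.solve(np.eye(2) - (np.eye(2) - lr * A), lr * db)

# Finite-difference check against re-running the unrolled solver:
eps = 1e-5
fd = (solve_unrolled(theta + eps) - solve_unrolled(theta - eps)) / (2 * eps)
print(dx_dtheta, fd)                       # should agree closely
```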
arXiv Detail & Related papers (2023-12-28T23:15:18Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives [1.7640556247739623]
We analyze algorithms for zeroth-order optimization of nonconvex composite objectives, focusing on reducing the dependence on dimensionality.
This is achieved by exploiting the low-dimensional structure of the decision set using the mirror descent method with an entropy-like distance-generating function.
To improve the gradient estimate, we replace the classic sampling method with one based on Rademacher random vectors and show that the mini-batch method copes with the non-Euclidean geometry.
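A hedged sketch of these two ingredients: a two-point Rademacher gradient estimator inside entropic mirror descent on the simplex. The step sizes and toy objective are illustrative assumptions, not the paper's exact algorithm.

```python
# Two-point zeroth-order estimator with Rademacher perturbations, plugged
# into entropic mirror descent (exponentiated gradient) on the simplex.
import numpy as np

rng = np.random.default_rng(0)

def f(x):                        # toy composite objective (assumption)
    return np.sum(np.abs(x - 0.1)) + 0.5 * np.sum(x**2)

d, eta, delta = 5, 0.1, 1e-4
x = np.full(d, 1.0 / d)          # start at the uniform point of the simplex
for _ in range(500):
    u = rng.choice([-1.0, 1.0], size=d)               # Rademacher direction
    g = (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u
    x = x * np.exp(-eta * g)                          # entropic mirror step
    x /= x.sum()                                      # projection onto simplex
print(x, f(x))
```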
arXiv Detail & Related papers (2022-08-09T07:36:25Z) - An Adaptive Gradient Method with Energy and Momentum [0.0]
We introduce a novel algorithm for gradient-based optimization of objective functions.
The method is simple to implement, computationally efficient, and well suited for large-scale machine learning problems.
arXiv Detail & Related papers (2022-03-23T04:48:38Z) - A theoretical and empirical study of new adaptive algorithms with
additional momentum steps and shifted updates for stochastic non-convex
optimization [0.0]
Adaptive optimization algorithms are widely regarded as a key pillar behind the success of the Deep Learning field.
In this paper we introduce adaptive momentum techniques for different classes of non-smooth stochastic objective problems.
arXiv Detail & Related papers (2021-10-16T09:47:57Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study stochastic optimization of the area under the precision-recall curve (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with an improved iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same $O(1/\epsilon^4)$ complexity, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Real-Time Optimization Meets Bayesian Optimization and Derivative-Free
Optimization: A Tale of Modifier Adaptation [0.0]
This paper investigates a new class of modifier-adaptation schemes to overcome plant-model mismatch in real-time optimization of uncertain processes.
The proposed schemes embed a physical model and rely on trust-region ideas to minimize risk during exploration.
The benefits of using an acquisition function, knowing the process noise level, or specifying a nominal process model are illustrated.
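A minimal sketch of the classic first-order modifier-adaptation loop that these schemes build on; the toy plant/model pair is an assumption, and the paper adds trust-region safeguards and Bayesian acquisition on top of this basic idea.

```python
# Classic modifier adaptation: at each iterate, add a first-order correction
# to the model so its gradient matches the (measured) plant gradient, then
# re-optimize the modified model.
import numpy as np

plant = lambda u: (u - 2.0) ** 2 + 0.5 * u     # "true" process cost (toy)
model = lambda u: (u - 1.0) ** 2               # mismatched model (toy)

def dplant(u, h=1e-4):                         # plant gradient from measurements
    return (plant(u + h) - plant(u - h)) / (2 * h)

u = 0.0
for k in range(20):
    lam = dplant(u) - 2 * (u - 1.0)            # first-order modifier at u_k
    # minimize the modified model: model(u) + lam * (u - u_k)
    u_new = 1.0 - lam / 2.0                    # closed form for this quadratic
    u = u + 0.5 * (u_new - u)                  # filtered (damped) update
print("converged input:", u, "plant cost:", plant(u))
```

At convergence the modified model is tangent to the plant, so the loop reaches a plant-stationary point (here u = 1.75) despite the mismatched model.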
arXiv Detail & Related papers (2020-09-18T12:57:17Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
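A small sketch of the Markov-chain intuition: with a constant step size, SGD iterates reach a stationary distribution around the optimum, and averaging the tail of the chain beats the last iterate. The noisy quadratic is an assumed test problem; the paper's framework adds convergence diagnostics on top.

```python
# Constant-step-size SGD on a noisy quadratic: last iterate vs tail average.
import numpy as np

rng = np.random.default_rng(0)
opt = np.array([1.0, -2.0])
x, iterates = np.zeros(2), []
for t in range(2000):
    grad = (x - opt) + rng.normal(size=2)   # stochastic gradient
    x = x - 0.05 * grad                     # constant step size
    iterates.append(x.copy())

tail = np.array(iterates[1000:])            # discard warmup, keep the "chain"
print("last iterate error:", np.linalg.norm(iterates[-1] - opt))
print("tail-average error:", np.linalg.norm(tail.mean(axis=0) - opt))
```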
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove an $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
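For reference, the stationarity measure in that rate is the gradient of the Moreau envelope of the weakly convex objective $\varphi$; a small residual certifies near-stationarity:

```latex
\varphi_\lambda(x) \;=\; \min_{y}\; \varphi(y) + \tfrac{1}{2\lambda}\,\|x - y\|^2,
\qquad
\nabla \varphi_\lambda(x) \;=\; \lambda^{-1}\big(x - \operatorname{prox}_{\lambda\varphi}(x)\big),
```

so the claimed bound $\|\nabla \varphi_\lambda(x_t)\| = \tilde{\mathcal{O}}(t^{-1/4})$ says the iterates approach points that are close to stationary points of $\varphi$.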
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.