Efficient displacement convex optimization with particle gradient
descent
- URL: http://arxiv.org/abs/2302.04753v1
- Date: Thu, 9 Feb 2023 16:35:59 GMT
- Title: Efficient displacement convex optimization with particle gradient
descent
- Authors: Hadi Daneshmand, Jason D. Lee, and Chi Jin
- Abstract summary: Particle gradient descent is widely used to optimize functions of probability measures.
This paper considers particle gradient descent with a finite number of particles and establishes its theoretical guarantees to optimize functions that are \emph{displacement convex} in measures.
- Score: 57.88860627977882
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Particle gradient descent, which uses particles to represent a probability
measure and performs gradient descent on particles in parallel, is widely used
to optimize functions of probability measures. This paper considers particle
gradient descent with a finite number of particles and establishes its
theoretical guarantees to optimize functions that are \emph{displacement
convex} in measures. Concretely, for Lipschitz displacement convex functions
defined on probability measures over $\mathbb{R}^d$, we prove that $O(1/\epsilon^2)$
particles and $O(d/\epsilon^4)$ computations are sufficient to find the
$\epsilon$-optimal solutions. We further provide improved complexity bounds for
optimizing smooth displacement convex functions. We demonstrate the application
of our results for function approximation with specific neural architectures
with two-dimensional inputs.
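As a rough illustration of the setting (and not the paper's exact algorithm, objective, or constants), the sketch below represents a probability measure by $n$ particles and runs gradient descent on all particles in parallel for a simple displacement convex energy: a convex potential plus a quadratic attractive interaction. The energy, the step size, and the $n$-scaling of the gradient are illustrative assumptions.
```python
# Minimal sketch of particle gradient descent (illustrative only; not the paper's
# exact algorithm or constants). A probability measure is represented by n particles
# in R^d, and gradient descent is applied to all particles in parallel.
import jax
import jax.numpy as jnp

def energy(particles):
    # F(mu_n) = (1/n) sum_i V(x_i) + (1/(2 n^2)) sum_{i,j} W(x_i - x_j),
    # with a convex potential V and a convex interaction W (displacement convex).
    V = 0.5 * jnp.sum((particles - 1.0) ** 2, axis=1)        # quadratic potential
    diffs = particles[:, None, :] - particles[None, :, :]
    W = 0.5 * jnp.sum(diffs ** 2, axis=-1)                   # attractive interaction
    return jnp.mean(V) + 0.5 * jnp.mean(W)

@jax.jit
def step(particles, lr=0.1):
    n = particles.shape[0]
    # Scale by n so each particle moves along the Wasserstein gradient-flow velocity
    # at the empirical measure (a common convention; the constant is an assumption).
    return particles - lr * n * jax.grad(energy)(particles)

key = jax.random.PRNGKey(0)
particles = jax.random.normal(key, (256, 2))                 # n = 256 particles, d = 2
for _ in range(200):
    particles = step(particles)
print(float(energy(particles)))                              # approaches the minimum (0 here)
```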
Related papers
- Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization [2.0657831823662574]
When a nonconvex problem is further complicated by stochasticity, the sample complexity of first-order methods may depend linearly on the problem dimension, which is undesirable for high-dimensional problems.
Our algorithms allow for dimension-insensitive complexity estimates of order $\mathcal{O}(\log d/\epsilon^4)$.
We prove that DISFOM can sharpen this bound by employing variance reduction.
arXiv Detail & Related papers (2024-06-27T18:38:42Z) - On Convergence of Incremental Gradient for Non-Convex Smooth Functions [63.51187646914962]
In machine learning and network optimization, algorithms like shuffle SGD are popular due to minimizing the number of cache misses and achieving good cache locality.
This paper delves into the convergence properties of SGD algorithms with arbitrary data ordering.
arXiv Detail & Related papers (2023-05-30T17:47:27Z) - Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood
Estimation [2.53740603524637]
We develop a class of interacting particle systems for implementing a maximum marginal likelihood estimation procedure.
In particular, we prove that the parameter marginal of the stationary measure of this diffusion has the form of a Gibbs measure.
Using a particular rescaling, we then prove geometric ergodicity of this system and bound the discretisation error in a manner that is uniform in time and does not increase with the number of particles (a toy sketch of an interacting-particle Langevin update of this flavour appears after this list).
arXiv Detail & Related papers (2023-03-23T16:50:08Z) - Continuized Acceleration for Quasar Convex Functions in Non-Convex
Optimization [13.427128424538502]
Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex.
We show that acceleration under quasar convexity is possible under certain conditions, while acceleration is not known in general for these classes of functions.
arXiv Detail & Related papers (2023-02-15T18:35:45Z) - Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z) - Reducing the Variance of Gaussian Process Hyperparameter Optimization
with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization [106.70006655990176]
A distributional optimization problem arises widely in machine learning and statistics.
We propose a novel particle-based algorithm, dubbed variational transport, which approximately performs Wasserstein gradient descent.
We prove that when the objective function satisfies a functional version of the Polyak-Lojasiewicz (PL) condition (Polyak, 1963) and smoothness conditions, variational transport converges linearly.
arXiv Detail & Related papers (2020-12-21T18:33:13Z) - Deep FPF: Gain function approximation in high-dimensional setting [8.164433158925592]
We present a novel approach to approximate the gain function of the feedback particle filter (FPF).
The numerical problem is to approximate the exact gain function using only finitely many particles sampled from the probability distribution.
Inspired by the recent success of deep learning methods, we represent the gain function as the gradient of the output of a neural network (a toy sketch of this parameterization appears after this list).
arXiv Detail & Related papers (2020-10-02T20:17:21Z)
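For the interacting particle Langevin entry above, here is a toy sketch of one plausible coupled parameter-particle Langevin update on a simple Gaussian latent-variable model ($y \mid x \sim N(x,1)$, $x \sim N(\theta,1)$). The model, step size, and noise scalings are illustrative assumptions, not that paper's exact dynamics or analysis.
```python
# Toy sketch of a coupled parameter/particle Langevin update for maximum marginal
# likelihood estimation on a Gaussian latent-variable model: y | x ~ N(x, 1),
# x ~ N(theta, 1). The coupling, step size, and noise scalings are illustrative
# assumptions, not the exact dynamics analysed in the paper.
import jax
import jax.numpy as jnp

y = 2.0                                    # single observed data point (toy example)

def grad_theta(theta, x):                  # d/dtheta log p_theta(x, y) = x - theta
    return x - theta

def grad_x(theta, x):                      # d/dx log p_theta(x, y) = (y - x) + (theta - x)
    return (y - x) + (theta - x)

def langevin_step(key, theta, particles, gamma=0.01):
    n = particles.shape[0]
    k1, k2 = jax.random.split(key)
    theta_new = (theta
                 + gamma * jnp.mean(grad_theta(theta, particles))
                 + jnp.sqrt(2 * gamma / n) * jax.random.normal(k1))
    particles_new = (particles
                     + gamma * grad_x(theta, particles)
                     + jnp.sqrt(2 * gamma) * jax.random.normal(k2, particles.shape))
    return theta_new, particles_new

key = jax.random.PRNGKey(1)
theta, particles = 0.0, jnp.zeros(100)
for _ in range(2000):
    key, sub = jax.random.split(key)
    theta, particles = langevin_step(sub, theta, particles)
print(float(theta))                        # drifts toward the marginal MLE (theta* = y here)
```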
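And for the Deep FPF entry above, the sketch below illustrates only the representational idea mentioned there: parameterize a scalar potential with a small neural network and take the gain function to be its input gradient, evaluated at the particle locations. The architecture and initialization are assumptions for illustration, and the training objective that fits the network to the particles is omitted.
```python
# Sketch of the gain-as-gradient parameterization only: a scalar potential phi_theta is
# given by a small neural network and the gain is K(x) = grad_x phi_theta(x), evaluated
# at particle locations. Architecture and initialization are illustrative assumptions;
# the objective used to fit theta to the particles is omitted.
import jax
import jax.numpy as jnp

def phi(theta, x):
    # Scalar-valued two-layer network phi_theta : R^d -> R.
    W1, b1, w2, b2 = theta
    h = jnp.tanh(x @ W1 + b1)
    return h @ w2 + b2

gain = jax.grad(phi, argnums=1)                    # gain at one particle: grad of phi w.r.t. x
batched_gain = jax.vmap(gain, in_axes=(None, 0))   # gain at all particles at once

d, hidden, n = 3, 16, 100
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
theta = (jax.random.normal(k1, (d, hidden)) / jnp.sqrt(d),
         jnp.zeros(hidden),
         jax.random.normal(k2, (hidden,)) / jnp.sqrt(hidden),
         0.0)
particles = jax.random.normal(k3, (n, d))          # particles sampled from a toy distribution
print(batched_gain(theta, particles).shape)        # (n, d): one gain vector per particle
```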