Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
- URL: http://arxiv.org/abs/2509.16974v1
- Date: Sun, 21 Sep 2025 08:14:20 GMT
- Title: Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
- Authors: Naoya Yamamoto, Juno Kim, Taiji Suzuki
- Abstract summary: Wasserstein gradient flow (WGF) is a common method to perform optimization over the space of probability measures. We show that PWGF achieves second-order optimality for general nonconvex objectives and converges to a global optimum for strictly benign objectives.
- Score: 54.06226763868876
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wasserstein gradient flow (WGF) is a common method to perform optimization over the space of probability measures. While WGF is guaranteed to converge to a first-order stationary point, for nonconvex functionals the converged solution does not necessarily satisfy the second-order optimality condition; i.e., it could converge to a saddle point. In this work, we propose a new algorithm for probability measure optimization, perturbed Wasserstein gradient flow (PWGF), that achieves second-order optimality for general nonconvex objectives. PWGF enhances WGF by injecting noisy perturbations near saddle points via a Gaussian process-based scheme. By pushing the measure forward along a random vector field generated from a Gaussian process, PWGF helps the solution escape saddle points efficiently by perturbing the solution towards the smallest eigenvalue direction of the Wasserstein Hessian. We theoretically derive the computational complexity for PWGF to achieve a second-order stationary point. Furthermore, we prove that PWGF converges to a global optimum in polynomial time for strictly benign objectives.
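The abstract pins the mechanism down well enough for a toy illustration: discretize the measure into particles, run forward-Euler WGF steps, and whenever the flow stalls near a stationary point, push every particle forward along one random smooth vector field (random Fourier features below stand in for a Gaussian-process sample). A minimal, hypothetical Python sketch; the toy potential, thresholds, and function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def potential_grad(x):
    # Toy potential V(x, y) = x^2 + (y^2 - 1)^2: the linear functional
    # F(mu) = E_mu[V] has a saddle when all mass sits at the origin and
    # global minima when it sits at (0, +1) or (0, -1).
    return np.array([2.0 * x[0], 4.0 * x[1] * (x[1] ** 2 - 1.0)])

def sample_gp_field(dim=2, n_features=32, lengthscale=1.0, rng=None):
    """One random smooth vector field, built from random Fourier features
    as a cheap stand-in for a Gaussian-process sample."""
    rng = np.random.default_rng() if rng is None else rng
    W = rng.normal(scale=1.0 / lengthscale, size=(n_features, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    A = rng.normal(size=(n_features, dim)) / np.sqrt(n_features)
    return lambda x: np.cos(W @ x + b) @ A          # maps R^dim -> R^dim

def pwgf_sketch(particles, steps=500, eta=0.05, grad_tol=1e-3, noise=0.1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(steps):
        grads = np.array([potential_grad(p) for p in particles])
        if np.abs(grads).max() < grad_tol:
            # Near a stationary point: push the whole measure forward
            # along a single random vector field to escape the saddle.
            field = sample_gp_field(rng=rng)
            particles = np.array([p + noise * field(p) for p in particles])
        else:
            # Ordinary forward-Euler Wasserstein gradient flow step.
            particles = particles - eta * grads
    return particles

particles = np.zeros((64, 2))                       # all mass at the saddle
print(pwgf_sketch(particles).mean(axis=0))          # ends near (0, +1) or (0, -1)
```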
Related papers
- Randomized Gradient Descents on Riemannian Manifolds: Almost Sure Convergence to Global Minima in and beyond Quantum Optimization [0.0]
We prove that randomized gradient descent methods converge to a single local optimum almost surely despite the existence of saddle points.
As a major application, we consider ground-state preparation through quantum optimization over the unitary group.
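For intuition, a toy instance of the same phenomenon: Riemannian gradient descent for the Rayleigh quotient on the unit sphere, where every non-extreme eigenvector is a saddle, yet a random start converges to the minimizer almost surely. A generic sketch, not the paper's unitary-group setting:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([3.0, 1.0, -2.0])              # minimize x^T A x on the unit sphere

def riemannian_grad(x):
    g = 2.0 * A @ x                         # Euclidean gradient
    return g - (x @ g) * x                  # project onto the tangent space at x

x = rng.normal(size=3)                      # random start avoids saddles a.s.
x /= np.linalg.norm(x)
for _ in range(2000):
    x = x - 0.05 * riemannian_grad(x)
    x /= np.linalg.norm(x)                  # retraction back onto the sphere
print(x)                                    # ~(0, 0, +/-1): the global minimizer
```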
arXiv Detail & Related papers (2024-05-20T14:06:45Z)
- Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space [17.13355049019388]
We extend the gradient flow on Wasserstein space into the stochastic gradient descent (SGD) flow and stochastic variance reduction gradient (SVRG) flow.
By leveraging the property of Wasserstein space, we construct differential equations to approximate the corresponding discrete dynamics in Euclidean space.
The convergence rates of these flows are proven and match the corresponding results in Euclidean space.
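For reference, the Euclidean discrete dynamics in question; a minimal SVRG sketch (generic code, not from the paper):

```python
import numpy as np

def svrg(grad_i, n, x0, epochs=20, inner=100, eta=0.1, rng=None):
    """Plain Euclidean SVRG: the discrete dynamics whose continuous-time
    Wasserstein-space analogue the paper constructs."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        full_grad = np.mean([grad_i(snapshot, i) for i in range(n)], axis=0)
        for _ in range(inner):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient.
            v = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x -= eta * v
    return x

# Least squares: f(x) = (1/2n) * sum_i (a_i . x - b_i)^2
rng = np.random.default_rng(1)
Amat, b = rng.normal(size=(50, 5)), rng.normal(size=50)
grad_i = lambda x, i: (Amat[i] @ x - b[i]) * Amat[i]
print(svrg(grad_i, 50, np.zeros(5), rng=rng))
```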
arXiv Detail & Related papers (2024-01-24T15:35:44Z)
- Randomized Forward Mode of Automatic Differentiation For Optimization Algorithms [0.0]
We present a randomized forward mode gradient (RFG) as an alternative to backpropagation.
The probability distribution of the random vector determines the statistical properties of RFG.
By replacing gradient with RFG, a class of RFG-based optimization algorithms is obtained.
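A minimal sketch of the RFG idea: the directional derivative along a random vector v, times v, is an unbiased gradient estimate when v ~ N(0, I). Forward-mode AD would supply the directional derivative exactly; a central finite difference stands in for it here, and all names are illustrative:

```python
import numpy as np

def rfg(f, x, eps=1e-6, rng=None):
    """Randomized forward-mode gradient: (directional derivative of f at x
    along a random v) times v."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.normal(size=x.shape)
    dirderiv = (f(x + eps * v) - f(x - eps * v)) / (2.0 * eps)
    return dirderiv * v           # unbiased: E[(g.v)v] = g for v ~ N(0, I)

# Replacing the gradient in plain GD with the RFG estimate:
f = lambda x: np.sum((x - 1.0) ** 2)
x, rng = np.zeros(4), np.random.default_rng(2)
for _ in range(3000):
    x -= 0.01 * rfg(f, x, rng=rng)
print(x)                          # approaches (1, 1, 1, 1) in expectation
```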
arXiv Detail & Related papers (2023-10-22T04:02:39Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
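For context, entropy-regularized OT is classically solved by Sinkhorn scaling; the sketch below shows that standard baseline (the paper's extragradient method itself is not reproduced here):

```python
import numpy as np

def sinkhorn(C, mu, nu, reg=0.05, iters=500):
    """Entropy-regularized OT between histograms mu and nu with cost C,
    solved by classic Sinkhorn scaling."""
    K = np.exp(-C / reg)                 # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (K.T @ u)               # alternate marginal scalings
        u = mu / (K @ v)
    P = u[:, None] * K * v[None, :]      # transport plan with correct marginals
    return np.sum(P * C)                 # regularized OT cost

n = 8
x = np.linspace(0.0, 1.0, n)
C = (x[:, None] - x[None, :]) ** 2       # squared-distance cost on a grid
mu = np.full(n, 1.0 / n)
nu = np.linspace(1.0, 2.0, n); nu /= nu.sum()
print(sinkhorn(C, mu, nu))
```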
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization [94.19177623349947]
Nonsmooth nonconvex optimization problems emerge in machine learning and business decision making.
Two core challenges impede the development of efficient methods with finite-time convergence guarantees.
Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results.
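A minimal sketch of the two-point randomized-smoothing estimator underlying gradient-free methods (GFM) of this kind; the construction below is the standard one, with illustrative constants:

```python
import numpy as np

def gfm_grad(f, x, delta=0.05, rng=None):
    """Two-point gradient-free estimator: an unbiased estimate of the
    gradient of the smoothed surrogate f_delta(x) = E[f(x + delta*u)],
    which is well defined even when f itself is nonsmooth."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.normal(size=x.shape)
    u /= np.linalg.norm(u)                       # uniform direction on the sphere
    d = x.size
    return d / (2.0 * delta) * (f(x + delta * u) - f(x - delta * u)) * u

f = lambda x: np.abs(x).sum()                    # nonsmooth test objective
x, rng = np.full(5, 3.0), np.random.default_rng(3)
for _ in range(5000):
    x -= 0.01 * gfm_grad(f, x, rng=rng)
print(x)                                         # shrinks toward 0
```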
arXiv Detail & Related papers (2022-09-12T06:53:24Z)
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth stochastic convex optimization have complexity bounds that depend unfavorably on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
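A generic clipped-SGD sketch showing the clipping operation these stepsize rules build on; the constant stepsize and clipping level below are placeholders, not the paper's rules:

```python
import numpy as np

def clipped_sgd(grad_oracle, x0, steps=2000, eta=0.05, clip=1.0, rng=None):
    """SGD with gradient clipping: heavy-tailed stochastic gradients are
    rescaled to norm at most `clip`, which is what yields high-probability
    guarantees in this line of work."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(steps):
        g = grad_oracle(x, rng)
        norm = np.linalg.norm(g)
        if norm > clip:
            g = g * (clip / norm)        # clip: min(1, clip/||g||) * g
        x -= eta * g
    return x

# Quadratic objective with heavy-tailed (standard Cauchy) gradient noise.
grad_oracle = lambda x, rng: 2.0 * x + rng.standard_cauchy(size=x.shape)
print(clipped_sgd(grad_oracle, np.full(3, 5.0)))
```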
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost.
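A toy sketch of zeroth-order estimation with coordinate importance sampling: drawing coordinates non-uniformly and importance-weighting the finite difference keeps the estimator unbiased. The sampling distribution and constants are hypothetical:

```python
import numpy as np

def zo_coord_grad(f, x, probs, mu=1e-5, rng=None):
    """Zeroth-order partial-derivative estimate for one coordinate drawn
    from an importance distribution `probs`; dividing by probs[i] keeps
    the estimator unbiased while focusing queries on important coordinates."""
    rng = np.random.default_rng() if rng is None else rng
    i = rng.choice(x.size, p=probs)
    e = np.zeros_like(x); e[i] = 1.0
    partial = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
    g = np.zeros_like(x)
    g[i] = partial / probs[i]            # importance-weighted finite difference
    return g

weights = np.array([10.0, 1.0, 0.1])    # ill-conditioned quadratic
f = lambda x: np.sum(weights * x ** 2)
probs = weights / weights.sum()         # sample high-curvature coords more often
x, rng = np.ones(3), np.random.default_rng(4)
for _ in range(4000):
    x -= 0.005 * zo_coord_grad(f, x, probs, rng=rng)
print(x)                                # all coordinates shrink toward 0
```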
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- On The Convergence of Euler Discretization of Finite-Time Convergent Gradient Flows [4.401622714202886]
We investigate the performance of two novel first-order optimization algorithms, namely the rescaled-gradient flow (RGF) and the signed-gradient flow (SGF).
These algorithms are derived from the forward discretization of finite-time convergent flows, comprised of non-Lipschitz dynamical systems, which locally converge to the minima of gradient-dominated functions.
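A one-dimensional sketch of the contrast: the Euler discretization of the signed-gradient flow reaches a small neighborhood of the minimizer of x^4 in finitely many steps, while plain gradient descent crawls because the gradient flattens near 0 (a toy example, not the paper's experiments):

```python
import numpy as np

grad = lambda x: 4.0 * x ** 3                    # f(x) = x^4: gradient vanishes fast at 0

x_gd, x_sgf = 2.0, 2.0
for _ in range(200):
    x_gd = x_gd - 0.01 * grad(x_gd)              # Euler step of the gradient flow
    x_sgf = x_sgf - 0.01 * np.sign(grad(x_sgf))  # Euler step of the signed-gradient flow
print(x_gd, x_sgf)   # GD is still ~0.25; SGF is in a 0.01-neighborhood of 0
```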
arXiv Detail & Related papers (2020-10-06T19:28:00Z)
- An adaptive stochastic gradient-free approach for high-dimensional blackbox optimization [0.0]
We propose an adaptive stochastic gradient-free (ASGF) approach for high-dimensional blackbox optimization problems.
We illustrate the performance of this method on benchmark global optimization problems and learning tasks.
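A loose sketch in the same spirit: Gaussian-smoothing descent whose smoothing radius shrinks whenever a step fails to improve the objective. The adaptation rule here is a stand-in; ASGF's actual rules differ:

```python
import numpy as np

def asgf_like(f, x0, steps=300, eta=0.05, sigma=0.5, m=16, rng=None):
    """Gradient-free descent from function evaluations only, with a crude
    adaptive rule that refines the smoothing scale on failed steps."""
    rng = np.random.default_rng() if rng is None else rng
    x, fx = x0.copy(), f(x0)
    for _ in range(steps):
        U = rng.normal(size=(m, x.size))                 # m random directions
        fd = (np.array([f(x + sigma * u) for u in U]) - fx) / sigma
        g = fd @ U / m                                   # smoothed-gradient estimate
        x_new = x - eta * g
        f_new = f(x_new)
        if f_new < fx:
            x, fx = x_new, f_new                         # accept improving steps
        else:
            sigma *= 0.5                                 # otherwise refine the scale
    return x

f = lambda x: np.sum(x ** 2) + 0.1 * np.sum(np.cos(5.0 * x))   # mildly multimodal
print(asgf_like(f, np.full(10, 3.0), rng=np.random.default_rng(5)))
```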
arXiv Detail & Related papers (2020-06-18T22:47:58Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\widetilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
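A generic AMSGrad sketch in exactly this regime (mini-batch size 1, constant moment parameters); illustrative code, not the paper's:

```python
import numpy as np

def amsgrad(grad_oracle, x0, steps=3000, eta=0.01, b1=0.9, b2=0.999, eps=1e-8, rng=None):
    """AMSGrad with constant moment parameters and single-sample gradients."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    m, v, v_hat = np.zeros_like(x), np.zeros_like(x), np.zeros_like(x)
    for _ in range(steps):
        g = grad_oracle(x, rng)               # one stochastic gradient (batch size 1)
        m = b1 * m + (1 - b1) * g             # first moment estimate
        v = b2 * v + (1 - b2) * g ** 2        # second moment estimate
        v_hat = np.maximum(v_hat, v)          # AMSGrad: non-decreasing scaling
        x -= eta * m / (np.sqrt(v_hat) + eps)
    return x

# Nonsmooth (hence weakly convex) toy objective f(x) = ||x||_1 with noisy subgradients.
grad_oracle = lambda x, rng: np.sign(x) + 0.1 * rng.normal(size=x.shape)
print(amsgrad(grad_oracle, np.full(4, 2.0)))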
arXiv Detail & Related papers (2020-06-11T17:43:19Z)