Stochastic Zeroth-order Riemannian Derivative Estimation and
Optimization
- URL: http://arxiv.org/abs/2003.11238v3
- Date: Tue, 5 Jan 2021 04:05:52 GMT
- Title: Stochastic Zeroth-order Riemannian Derivative Estimation and
Optimization
- Authors: Jiaxiang Li, Krishnakumar Balasubramanian, Shiqian Ma
- Abstract summary: We propose estimators of the Riemannian gradient and Hessian from noisy objective function evaluations, based on a Riemannian version of the Gaussian smoothing technique, which overcome the difficulty posed by the non-linearity of the manifold constraint.
We demonstrate the applicability of our algorithms through simulation results and real-world applications on black-box stiffness control for robotics and black-box attacks on neural networks.
- Score: 15.78743548731191
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider stochastic zeroth-order optimization over Riemannian submanifolds
embedded in Euclidean space, where the task is to solve Riemannian optimization
problems with only noisy objective function evaluations. Towards this, our main
contribution is to propose estimators of the Riemannian gradient and Hessian
from noisy objective function evaluations, based on a Riemannian version of the
Gaussian smoothing technique. The proposed estimators overcome the difficulty
of the non-linearity of the manifold constraint and the issues that arise in
using Euclidean Gaussian smoothing techniques when the function is defined only
over the manifold. We use the proposed estimators to solve Riemannian
optimization problems in the following settings for the objective function: (i)
stochastic and gradient-Lipschitz (in both nonconvex and geodesic convex
settings), (ii) sum of gradient-Lipschitz and non-smooth functions, and (iii)
Hessian-Lipschitz. For these settings, we analyze the oracle complexity of our
algorithms to obtain appropriately defined notions of $\epsilon$-stationary
point or $\epsilon$-approximate local minimizer. Notably, our complexities are
independent of the dimension of the ambient Euclidean space and depend only on
the intrinsic dimension of the manifold under consideration. We demonstrate the
applicability of our algorithms by simulation results and real-world
applications on black-box stiffness control for robotics and black-box attacks
on neural networks.
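To make the proposed gradient estimator concrete, here is a minimal Python sketch of Gaussian smoothing over a Riemannian submanifold, specialized to the unit sphere: sample a Gaussian direction, project it onto the tangent space, and form a forward difference along a retraction. The helper names (`project_tangent`, `retract`, `zo_riemannian_grad`) and all constants (`mu`, `num_samples`, the step size) are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import numpy as np

def project_tangent(x, v):
    """Orthogonal projection of v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Metric-projection retraction: step along tangent vector v, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def zo_riemannian_grad(f, x, mu=1e-4, num_samples=20, rng=None):
    """Zeroth-order estimate of the Riemannian gradient via Gaussian smoothing.

    Averages forward differences (f(Retr_x(mu*u)) - f(x)) / mu * u over
    Gaussian directions u sampled in the tangent space at x; the sampled
    direction lives in the tangent space, so the estimator only ever sees
    the manifold's intrinsic dimension, mirroring the dimension-independence
    highlighted in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(num_samples):
        u = project_tangent(x, rng.standard_normal(x.shape[0]))
        g += (f(retract(x, mu * u)) - fx) / mu * u
    return g / num_samples

# Illustrative use: minimize a noisy Rayleigh quotient f(x) = x^T A x + noise
# over the sphere with zeroth-order Riemannian gradient descent.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((10, 10))
    A = (A + A.T) / 2
    f = lambda x: x @ A @ x + 1e-3 * rng.standard_normal()
    x = rng.standard_normal(10)
    x /= np.linalg.norm(x)
    for _ in range(200):
        x = retract(x, -0.1 * zo_riemannian_grad(f, x, rng=rng))
```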
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm has access only to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Riemannian Bilevel Optimization [35.42472057648458]
We focus in particular on batch and gradient-based methods, with the explicit goal of avoiding second-order information.
We propose and analyze $\mathrm{RF^2SA}$, a method that leverages first-order gradient information.
We provide explicit convergence rates for reaching $\epsilon$-stationary points under various setups.
arXiv Detail & Related papers (2024-05-22T20:49:01Z) - Convergence and complexity of block majorization-minimization for constrained block-Riemannian optimization [18.425648833592312]
Block majorization-minimization (BMM) is a simple iterative algorithm for constrained block-Riemannian optimization.
Our analysis makes explicit use of Euclidean constraints.
arXiv Detail & Related papers (2023-12-16T05:40:19Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives [1.7640556247739623]
We analyze algorithms for zeroth-order optimization of nonconvex composite objectives, focusing on the dependence on dimensionality.
This is achieved by exploiting the low-dimensional structure of the decision set using the mirror descent method with an entropy-like function.
To improve the gradient estimates, we replace the classic sampling method with one based on the Rademacher distribution and show that the mini-batch method copes with non-Euclidean geometry (a hypothetical sketch of a Rademacher-direction estimator appears after this list).
arXiv Detail & Related papers (2022-08-09T07:36:25Z) - First-Order Algorithms for Min-Max Optimization in Geodesic Metric
Spaces [93.35384756718868]
Min-max algorithms have been analyzed in the Euclidean setting.
We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate.
arXiv Detail & Related papers (2022-06-04T18:53:44Z) - A Riemannian smoothing steepest descent method for non-Lipschitz
optimization on submanifolds [15.994495285950293]
A smoothing steepest descent method is proposed to minimize a non-Lipschitz function on submanifolds.
We prove that any accumulation point of the sequence generated by the smoothing method is a limiting stationary point of the original non-Lipschitz problem.
Numerical experiments are conducted to demonstrate the efficiency of the proposed method.
arXiv Detail & Related papers (2021-04-09T05:38:28Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost (a hypothetical sketch of coordinate importance sampling appears after this list).
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Riemannian Stochastic Proximal Gradient Methods for Nonsmooth
Optimization over the Stiefel Manifold [7.257751371276488]
R-ProxSGD and R-ProxSPB are generalizations of proximal SGD and proximal SpiderBoost.
The R-ProxSPB algorithm finds an $\epsilon$-stationary point with $\mathcal{O}(\epsilon^{-3})$ IFOs in the online case, and $\mathcal{O}(n+\sqrt{n}\epsilon^{-3})$ IFOs in the finite-sum case.
arXiv Detail & Related papers (2020-05-03T23:41:35Z)
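Following up the forward reference in the adaptive zeroth-order entry above, here is a minimal, hypothetical Python sketch of a two-point gradient estimator with Rademacher directions. The function name, constants, and symmetric-difference form are illustrative assumptions, not that paper's exact method.

```python
import numpy as np

def zo_grad_rademacher(f, x, mu=1e-4, num_samples=20, rng=None):
    """Two-point zeroth-order gradient estimate using Rademacher directions.

    Each direction u has i.i.d. +/-1 entries, so E[u u^T] = I and the
    symmetric difference (f(x + mu*u) - f(x - mu*u)) / (2*mu) scaled by u
    is, up to smoothing bias, an unbiased gradient estimate for smooth f.
    Constants here are illustrative, not the cited paper's choices.
    """
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.choice([-1.0, 1.0], size=x.shape[0])  # Rademacher direction
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_samples
```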
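And for the coordinate importance sampling mentioned in the zeroth-order hybrid gradient descent entry, a minimal, hypothetical sketch: coordinates are sampled from an importance distribution and the per-coordinate finite differences are reweighted to keep the estimate unbiased. The name, the reweighting scheme, and the choice of `probs` are assumptions for illustration, not that framework's actual design.

```python
import numpy as np

def zo_grad_coord_importance(f, x, probs, mu=1e-4, num_samples=20, rng=None):
    """Zeroth-order gradient estimate from coordinate-wise finite differences.

    Coordinate i is drawn with probability probs[i]; dividing its partial
    derivative estimate by probs[i] makes the average unbiased (up to the
    smoothing bias), so informative coordinates can be queried more often.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    g = np.zeros_like(x)
    for _ in range(num_samples):
        i = rng.choice(d, p=probs)
        e_i = np.zeros(d)
        e_i[i] = 1.0
        partial = (f(x + mu * e_i) - f(x - mu * e_i)) / (2 * mu)
        g[i] += partial / probs[i]
    return g / num_samples
```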