Stochastic Gauss-Newton Algorithms for Nonconvex Compositional
Optimization
- URL: http://arxiv.org/abs/2002.07290v2
- Date: Thu, 2 Jul 2020 20:41:20 GMT
- Title: Stochastic Gauss-Newton Algorithms for Nonconvex Compositional
Optimization
- Authors: Quoc Tran-Dinh and Nhan H. Pham and Lam M. Nguyen
- Abstract summary: We develop two new stochastic Gauss-Newton algorithms for solving a class of non-convex stochastic compositional optimization problems.
We consider both the expectation and finite-sum settings under standard assumptions.
- Score: 26.313415590777858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop two new stochastic Gauss-Newton algorithms for solving a class of
non-convex stochastic compositional optimization problems frequently arising in
practice. We consider both the expectation and finite-sum settings under
standard assumptions, and use both classical stochastic and SARAH estimators
for approximating function values and Jacobians. In the expectation case, we
establish $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity to achieve a
stationary point in expectation and estimate the total number of stochastic
oracle calls for both function value and its Jacobian, where $\varepsilon$ is a
desired accuracy. In the finite-sum case, we also estimate
$\mathcal{O}(\varepsilon^{-2})$ iteration-complexity and the total oracle calls
with high probability. To the best of our knowledge, this is the first time such
global stochastic oracle complexity is established for stochastic Gauss-Newton
methods. Finally, we illustrate our theoretical results via two numerical
examples on both synthetic and real datasets.
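For intuition, the following is a minimal Python sketch of the kind of update these methods perform, assuming the compositional template $\min_x \phi(F(x))$ with $F(x)$ a finite sum of component maps and the illustrative choice $\phi(u) = \frac{1}{2}\|u\|^2$, so that the Gauss-Newton subproblem has a closed form. The oracles sample_F and sample_J, the batch sizes, the regularization parameter, and the synthetic data are hypothetical and not taken from the paper; the paper's actual algorithms are stated for more general outer functions.

import numpy as np

def sarah_estimate(prev_est, x_new, x_old, oracle, batch):
    # SARAH-style recursive estimator:
    #   v_k = v_{k-1} + (1/|B|) * sum_{i in B} [ g_i(x_k) - g_i(x_{k-1}) ]
    diff = np.mean([oracle(x_new, i) - oracle(x_old, i) for i in batch], axis=0)
    return prev_est + diff

def sgn_step(x, F_est, J_est, reg):
    # Damped Gauss-Newton step for phi(u) = 0.5*||u||^2: solves
    #   min_d 0.5*||F_est + J_est d||^2 + 0.5*reg*||d||^2
    n = x.shape[0]
    d = np.linalg.solve(J_est.T @ J_est + reg * np.eye(n), -J_est.T @ F_est)
    return x + d

# Hypothetical finite-sum instance: F_i(x) = A_i x - b_i, i = 1..50.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20, 10))
b = rng.standard_normal((50, 20))
sample_F = lambda x, i: A[i] @ x - b[i]   # component value oracle
sample_J = lambda x, i: A[i]              # component Jacobian oracle

x = np.zeros(10)
batch = rng.choice(50, size=10, replace=False)
F_est = np.mean([sample_F(x, i) for i in batch], axis=0)  # classical estimators
J_est = np.mean([sample_J(x, i) for i in batch], axis=0)  # at the initial point
for k in range(100):
    x_new = sgn_step(x, F_est, J_est, reg=1.0)
    batch = rng.choice(50, size=10, replace=False)
    F_est = sarah_estimate(F_est, x_new, x, sample_F, batch)  # SARAH update of F
    J_est = sarah_estimate(J_est, x_new, x, sample_J, batch)  # SARAH update of J
    x = x_new

Under these assumptions the loop should drive x toward a stationary point of the nonlinear least-squares objective $\frac{1}{2}\|F(x)\|^2$; for a general outer function $\phi$, the closed-form step above would be replaced by solving a Gauss-Newton subproblem at each iteration.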
Related papers
- Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance-reduced zeroth-order (ZO) estimators for complex gradient problems.
We improve the state-of-the-art function complexities from $\mathcal{O}\left(\min\left(\frac{d n^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right)\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^2}\right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z) - Fully Zeroth-Order Bilevel Programming via Gaussian Smoothing [7.143879014059895]
We study and analyze zeroth-order approximation algorithms for solving bilevel problems.
To the best of our knowledge, this is the first time that sample bounds are established for a fully zeroth-order bilevel optimization algorithm.
arXiv Detail & Related papers (2024-03-29T21:12:25Z) - Stochastic Halpern iteration in normed spaces and applications to reinforcement learning [0.30693357740321775]
We show that if the underlying oracle is uniformly bounded, our method exhibits an overall oracle complexity of $\tilde{O}(\varepsilon^{-5})$.
We propose new synchronous algorithms for average reward and discounted reward Markov decision processes.
arXiv Detail & Related papers (2024-03-19T01:07:35Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization with cubics (ARC) methods have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of a general nonconvex conic optimization problem that minimizes a twice differentiable function subject to conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with acceleration.
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - Stochastic Zeroth order Descent with Structured Directions [10.604744518360464]
We introduce and analyze Structured Zeroth order Descent (SSZD), a finite-difference approach that approximates a gradient on a set of $l \leq d$ directions, where $d$ is the dimension of the ambient space.
For convex functions we prove almost sure convergence with a rate on the function values of $O((d/l)\,k^{-c})$ for every $c < 1/2$, which is arbitrarily close to the one of Stochastic Gradient Descent (SGD) in terms of the number of iterations.
arXiv Detail & Related papers (2022-06-10T14:00:06Z) - A Projection-free Algorithm for Constrained Stochastic Multi-level
Composition Optimization [12.096252285460814]
We propose a projection-free conditional gradient-type algorithm for constrained stochastic multi-level composition optimization.
We show that the number of calls to the stochastic first-order oracle and the linear-minimization oracle required by the proposed algorithm are of order $\mathcal{O}_T(\epsilon^{-2})$ and $\mathcal{O}_T(\epsilon^{-3})$, respectively.
arXiv Detail & Related papers (2022-02-09T06:05:38Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional Optimization [20.410012564838933]
Compositional optimization arises in many important machine learning tasks such as value function evaluation in reinforcement learning and portfolio management.
In this paper, we investigate general compositional optimization in the general smooth non-convex setting.
arXiv Detail & Related papers (2019-12-31T18:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.