Stochastic Gauss-Newton Algorithms for Nonconvex Compositional
Optimization
- URL: http://arxiv.org/abs/2002.07290v2
- Date: Thu, 2 Jul 2020 20:41:20 GMT
- Title: Stochastic Gauss-Newton Algorithms for Nonconvex Compositional
Optimization
- Authors: Quoc Tran-Dinh and Nhan H. Pham and Lam M. Nguyen
- Abstract summary: We develop two new stochastic Gauss-Newton algorithms for solving a class of non-convex stochastic compositional optimization problems.
We consider both the expectation and finite-sum settings under standard assumptions.
- Score: 26.313415590777858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop two new stochastic Gauss-Newton algorithms for solving a class of
non-convex stochastic compositional optimization problems frequently arising in
practice. We consider both the expectation and finite-sum settings under
standard assumptions, and use both classical stochastic and SARAH estimators
for approximating function values and Jacobians. In the expectation case, we
establish $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity to achieve a
stationary point in expectation and estimate the total number of stochastic
oracle calls for both function value and its Jacobian, where $\varepsilon$ is a
desired accuracy. In the finite-sum case, we also estimate
$\mathcal{O}(\varepsilon^{-2})$ iteration-complexity and the total number of oracle
calls with high probability. To the best of our knowledge, this is the first time such
global stochastic oracle complexity is established for stochastic Gauss-Newton
methods. Finally, we illustrate our theoretical results via two numerical
examples on both synthetic and real datasets.
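To make the Gauss-Newton template concrete, here is a minimal mini-batch sketch for the special compositional case $\min_x \frac{1}{2}\|\frac{1}{n}\sum_i F_i(x)\|^2$. This is an illustration only, not the authors' exact method (which uses SARAH estimators and a general prox-linear subproblem); the names `stochastic_gauss_newton`, `F_i`, and `J_i` are hypothetical.

```python
import numpy as np

def stochastic_gauss_newton(F_i, J_i, n, x0, gamma=1.0, batch=8, iters=100, seed=0):
    """Sketch of a stochastic Gauss-Newton loop for
    min_x 0.5 * || (1/n) * sum_i F_i(x) ||^2  (a finite-sum compositional form).

    F_i(i, x) -> component map in R^m;  J_i(i, x) -> its m x d Jacobian.
    Each step linearizes F around x using mini-batch estimates of the
    function value and Jacobian, then takes a damped Gauss-Newton step.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        idx = rng.integers(0, n, size=batch)
        Fb = np.mean([F_i(i, x) for i in idx], axis=0)   # stochastic function estimate
        Jb = np.mean([J_i(i, x) for i in idx], axis=0)   # stochastic Jacobian estimate
        # damped GN direction: argmin_d 0.5*||Fb + Jb d||^2 + (1/(2*gamma))*||d||^2
        d = Jb.shape[1]
        A = Jb.T @ Jb + np.eye(d) / gamma
        step = np.linalg.solve(A, -Jb.T @ Fb)
        x = x + step
    return x
```

The damping term $1/(2\gamma)\|d\|^2$ keeps each subproblem well-posed even when the mini-batch Jacobian is rank-deficient.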
Related papers
- Fully Zeroth-Order Bilevel Programming via Gaussian Smoothing [7.143879014059895]
We study and analyze zeroth-order approximation algorithms for solving bilevel problems.
To the best of our knowledge, this is the first time that sample bounds are established for a fully zeroth-order bilevel optimization algorithm.
arXiv Detail & Related papers (2024-03-29T21:12:25Z) - Stochastic Halpern iteration in normed spaces and applications to reinforcement learning [0.30693357740321775]
We show that if the underlying oracle is uniformly bounded, our method exhibits an overall oracle complexity of $\tilde{O}(\varepsilon^{-5})$.
We propose new synchronous algorithms for average reward and discounted reward Markov decision processes.
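The Halpern iteration referenced above anchors each update toward the starting point with a vanishing weight. Below is a minimal sketch under the assumption of a nonexpansive map evaluated through a (possibly noisy) oracle; the names `halpern_iteration` and `T_noisy` are hypothetical.

```python
import numpy as np

def halpern_iteration(T_noisy, x0, iters=2000, seed=0):
    """Minimal sketch of the (stochastic) Halpern iteration for finding a
    fixed point x = T(x) of a nonexpansive map T.

    Update: x_{k+1} = beta_k * x0 + (1 - beta_k) * T(x_k),  beta_k = 1/(k+2).
    T_noisy(x, rng) returns a (possibly noisy) evaluation of T at x.
    """
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    for k in range(iters):
        beta = 1.0 / (k + 2)
        x = beta * x0 + (1 - beta) * T_noisy(x, rng)
    return x
```

The anchoring weight `beta` decays like $1/k$, which is what yields the method's last-iterate guarantees.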
arXiv Detail & Related papers (2024-03-19T01:07:35Z) - Calculating the expected value function of a two-stage stochastic
optimization program with a quantum algorithm [0.0]
Two-stage optimization involves calculating the expected cost of future decisions to inform the best decision in the present.
We provide a quantum algorithm to estimate the expected value function in a two-stage optimization problem in time largely independent from the complexity of the random variable.
arXiv Detail & Related papers (2024-02-23T00:07:34Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - A Newton-CG based barrier-augmented Lagrangian method for general
nonconvex conic optimization [77.8485863487028]
In this paper we consider finding an approximate second-order stationary point (SOSP) of a twice differentiable function subject to general nonconvex conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with accelerated gradient updates.
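AG-EG builds on the extragradient template. Here is a minimal sketch of the plain (non-accelerated) extragradient method for a smooth saddle-point problem, not the AG-EG algorithm itself; the names `extragradient`, `grad_x`, `grad_y` are hypothetical.

```python
import numpy as np

def extragradient(grad_x, grad_y, x0, y0, eta=0.1, iters=500):
    """Minimal sketch of the plain extragradient method for a smooth
    convex-concave saddle-point problem min_x max_y L(x, y).

    Each iteration takes an exploratory half-step at (x, y), then a full
    descent-ascent step using gradients evaluated at the half-step point.
    """
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(iters):
        # half step (extrapolation)
        xh = x - eta * grad_x(x, y)
        yh = y + eta * grad_y(x, y)
        # full step using gradients at the extrapolated point
        x = x - eta * grad_x(xh, yh)
        y = y + eta * grad_y(xh, yh)
    return x, y
```

The half-step lookahead is what lets the method handle the rotational dynamics of bilinear coupling, where plain gradient descent-ascent can diverge.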
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - A Projection-free Algorithm for Constrained Stochastic Multi-level
Composition Optimization [12.096252285460814]
We propose a projection-free conditional gradient-type algorithm for composition optimization.
We show that the number of calls to the stochastic oracle and the linear-minimization oracle required by the proposed algorithm are of order $\mathcal{O}_T(\epsilon^{-2})$ and $\mathcal{O}_T(\epsilon^{-3})$, respectively.
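The projection-free idea is to replace projections with a linear-minimization oracle (LMO). A minimal conditional-gradient (Frank-Wolfe) sketch for the single-level deterministic case follows, not the paper's multi-level stochastic algorithm; the names `frank_wolfe` and `simplex_lmo` are hypothetical.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, iters=2000):
    """Minimal sketch of a projection-free conditional-gradient (Frank-Wolfe)
    loop: each step calls a linear-minimization oracle (LMO) over the feasible
    set instead of computing a projection, then moves toward the returned vertex.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        g = grad(x)
        s = lmo(g)                 # argmin over the feasible set of <g, s>
        gamma = 2.0 / (k + 2)      # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

def simplex_lmo(g):
    """LMO for the probability simplex: the minimizing vertex e_i, i = argmin g."""
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s
```

Every iterate is a convex combination of simplex vertices, so feasibility holds by construction without any projection step.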
arXiv Detail & Related papers (2022-02-09T06:05:38Z) - On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used stochastic estimator based on a moving average (SEMA) of gradients for non-convex optimization problems.
Building on this estimator, we present state-of-the-art results for these problems.
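A moving-average gradient estimator can be sketched in a few lines: the running average damps per-sample noise before the update is applied. This is an illustrative sketch only, under the assumption of an unbiased noisy-gradient oracle; the names `sema_sgd` and `stoch_grad` are hypothetical.

```python
import numpy as np

def sema_sgd(stoch_grad, x0, alpha=0.1, lr=0.05, iters=3000, seed=0):
    """Minimal sketch of SGD driven by a stochastic moving-average (SEMA)
    gradient estimator: v_t = (1 - alpha) * v_{t-1} + alpha * g_t, where g_t
    is a noisy gradient at x_t. The moving average damps gradient noise.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(iters):
        g = stoch_grad(x, rng)
        v = (1 - alpha) * v + alpha * g   # moving-average estimator
        x = x - lr * v
    return x
```

Smaller `alpha` averages over a longer window, trading slower tracking of the true gradient for lower variance in the update direction.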
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional Optimization [20.410012564838933]
Compositional optimization arises in many important machine learning tasks such as value function evaluation in reinforcement learning and portfolio management.
In this paper, we investigate the general compositional optimization problem in the general smooth non-convex setting.
arXiv Detail & Related papers (2019-12-31T18:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.