Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional Optimization
- URL: http://arxiv.org/abs/1912.13515v2
- Date: Sat, 25 Jan 2020 10:52:11 GMT
- Title: Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional Optimization
- Authors: Huizhuo Yuan, Xiangru Lian, Ji Liu
- Abstract summary: Stochastic compositional optimization arises in many important machine learning tasks such as value function evaluation in reinforcement learning and portfolio management.
In this paper, we investigate stochastic compositional optimization in the general smooth non-convex setting.
- Score: 20.410012564838933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic compositional optimization arises in many important machine
learning tasks such as value function evaluation in reinforcement learning and
portfolio management. The objective function is the composition of two
expectations of stochastic functions, and is more challenging to optimize than
vanilla stochastic optimization problems. In this paper, we investigate
stochastic compositional optimization in the general smooth non-convex setting.
We employ a recently developed idea of \textit{Stochastic Recursive Gradient
Descent} to design a novel algorithm named SARAH-Compositional, and prove a
sharp Incremental First-order Oracle (IFO) complexity upper bound for
stochastic compositional optimization: $\mathcal{O}((n+m)^{1/2}
\varepsilon^{-2})$ in the finite-sum case and $\mathcal{O}(\varepsilon^{-3})$
in the online case. This complexity is the best known among IFO complexity
results for non-convex stochastic compositional optimization, and is
believed to be optimal. Our experiments validate the theoretical performance of
our algorithm.
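For intuition, below is a minimal Python sketch of the recursive variance-reduction idea on a toy finite-sum compositional problem $F(x) = f(g(x))$. The quadratic toy objective, the helper oracles (`g`, `g_jac`, `f_grad`), and all step-size, epoch, and batch choices are illustrative assumptions, not the paper's SARAH-Compositional algorithm or experimental setup; in particular, the paper's analysis targets the non-convex setting, while this toy is convex.

```python
import numpy as np

# Toy finite-sum compositional problem (illustrative only):
#   F(x) = f(g(x)),  g(x) = (1/n) sum_j g_j(x),  f(y) = (1/m) sum_i f_i(y).
rng = np.random.default_rng(0)
d, n, m = 5, 20, 20
A = rng.standard_normal((n, d, d)) / np.sqrt(d)   # inner maps g_j(x) = A_j x
C = rng.standard_normal((m, d, d)) / np.sqrt(d)   # outer terms f_i(y) = 0.5 ||C_i y||^2

def g(x, j):      return A[j] @ x                 # sample of the inner function
def g_jac(x, j):  return A[j]                     # its Jacobian (constant here)
def f_grad(y, i): return C[i].T @ (C[i] @ y)      # sample of the outer gradient

def sarah_compositional(x, epochs=10, inner_iters=10, lr=0.05):
    for _ in range(epochs):
        # Epoch start: full-batch snapshots of the inner value y ~ g(x),
        # the Jacobian J ~ g'(x), and the composite gradient estimate v.
        y = np.mean([g(x, j) for j in range(n)], axis=0)
        J = np.mean([g_jac(x, j) for j in range(n)], axis=0)
        v = J.T @ np.mean([f_grad(y, i) for i in range(m)], axis=0)
        for _ in range(inner_iters):
            x_prev, y_prev, J_prev = x, y, J
            x = x - lr * v                        # descend along the estimate
            j, i = rng.integers(n), rng.integers(m)
            # SARAH-style recursion: correct each tracked quantity with the
            # difference of a fresh sample at the new and previous iterates.
            y = y + g(x, j) - g(x_prev, j)
            J = J + g_jac(x, j) - g_jac(x_prev, j)
            v = v + J.T @ f_grad(y, i) - J_prev.T @ f_grad(y_prev, i)
    return x

x_out = sarah_compositional(rng.standard_normal(d))
print("||x|| after optimization:", np.linalg.norm(x_out))
```

The key point of the sketch is that each tracked quantity (inner value, Jacobian, gradient) is refreshed with a full batch only at epoch starts and otherwise corrected with single-sample differences, which is the mechanism behind the improved IFO bounds quoted above.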
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity [5.046146134689904]
We show that there is no gap in complexity analysis between bilevel and single-level optimization when implementing SPABA.
We also propose several other double-loop or nested bilevel algorithms to improve the state of the art in complexity analysis.
arXiv Detail & Related papers (2024-05-29T05:36:03Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Riemannian Stochastic Gradient Method for Nested Composition Optimization [0.0]
This work considers optimization of a composition of functions in a nested form over a Riemannian manifold, where each function contains an expectation.
This type of problem is gaining popularity in applications such as policy evaluation in reinforcement learning and model customization in meta-learning.
arXiv Detail & Related papers (2022-07-19T15:58:27Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Optimal Algorithms for Convex Nested Stochastic Composite Optimization [13.200502573462712]
Convex nested stochastic composite optimization (NSCO) has received considerable attention for its applications in reinforcement learning and risk-averse optimization.
The current NSCO algorithms have worse oracle complexities, by orders of magnitude, than those for simpler composite optimization problems without the nested structure.
We develop order-optimal algorithms for the convex NSCO problem constructed from an arbitrary composition of smooth, structured non-smooth and general non-smooth layer functions.
arXiv Detail & Related papers (2020-11-19T19:22:58Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a powerful tool for many machine learning problems.
We propose a novel stochastic bilevel optimizer named stocBiO, which features a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Solving Stochastic Compositional Optimization is Nearly as Easy as
Solving Stochastic Optimization [47.93365664380274]
This paper presents a new Stochastically Corrected Stochastic Compositional gradient method (SCSC); a hedged sketch of its corrected tracking step appears after this list.
SCSC runs in a single time scale with a single loop, uses a fixed batch size, and is guaranteed to converge at the same rate as the stochastic gradient descent (SGD) method for non-compositional optimization.
arXiv Detail & Related papers (2020-08-25T06:54:00Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Stochastic Gauss-Newton Algorithms for Nonconvex Compositional
Optimization [26.313415590777858]
We develop two new stochastic Gauss-Newton algorithms for solving a class of non-convex compositional optimization problems.
We consider both the expectation and finite-sum settings under standard assumptions.
arXiv Detail & Related papers (2020-02-17T22:56:45Z)
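For the SCSC entry above, the "stochastically corrected" tracking step can be sketched as follows; this is our paraphrase of the mechanism under the standard compositional setup $F(x) = f(g(x))$, and the paper's exact formulation and weights may differ:

$$y^{k+1} = (1-\beta_k)\big(y^k + g(x^{k+1};\phi^{k+1}) - g(x^k;\phi^{k+1})\big) + \beta_k\, g(x^{k+1};\phi^{k+1}),$$

where $y^k$ tracks the inner value $g(x^k)$, $\beta_k \in (0,1)$ is an averaging weight, and $\phi^{k+1}$ is a fresh sample. The difference term corrects the running average for the drift from $x^k$ to $x^{k+1}$, which is what permits a single time scale, a single loop, and a fixed batch size.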