Stochastic Halpern iteration in normed spaces and applications to reinforcement learning
- URL: http://arxiv.org/abs/2403.12338v3
- Date: Fri, 08 Nov 2024 20:18:33 GMT
- Title: Stochastic Halpern iteration in normed spaces and applications to reinforcement learning
- Authors: Mario Bravo, Juan Pablo Contreras,
- Abstract summary: We show that if the underlying oracle is uniformly bounded, our method exhibits an overall oracle complexity of $tildeO(varepsilon-5)$.
We propose new synchronous algorithms for average reward and discounted reward Markov decision processes.
- Score: 0.30693357740321775
- License:
- Abstract: We analyze the oracle complexity of the stochastic Halpern iteration with variance reduction, where we aim to approximate fixed-points of nonexpansive and contractive operators in a normed finite-dimensional space. We show that if the underlying stochastic oracle is with uniformly bounded variance, our method exhibits an overall oracle complexity of $\tilde{O}(\varepsilon^{-5})$, improving recent rates established for the stochastic Krasnoselskii-Mann iteration. Also, we establish a lower bound of $\Omega(\varepsilon^{-3})$, which applies to a wide range of algorithms, including all averaged iterations even with minibatching. Using a suitable modification of our approach, we derive a $O(\varepsilon^{-2}(1-\gamma)^{-3})$ complexity bound in the case in which the operator is a $\gamma$-contraction. As an application, we propose new synchronous algorithms for average reward and discounted reward Markov decision processes. In particular, for the average reward, our method improves on the best-known sample complexity.
Related papers
- Quantum Algorithms for Non-smooth Non-convex Optimization [30.576546266390714]
This paper considers the problem for finding the $(,epsilon)$-Goldstein stationary point of Lipschitz continuous objective.
We construct a zeroth-order quantum estimator for the surrogate oracle function.
arXiv Detail & Related papers (2024-10-21T16:52:26Z) - Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum [30.01198677588252]
First-order algorithms require at least $mathcalO(varepsilonepsilon-4)$ complexity to find an $varepsilon-stationary point.
We introduce novel momentum algorithms utilizing efficient variable complexity.
The effectiveness of the method is validated through robust logistic regression using real-world datasets.
arXiv Detail & Related papers (2024-06-18T20:14:52Z) - Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
arXiv Detail & Related papers (2024-06-04T15:23:29Z) - Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated non functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z) - Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum
Minimization [52.25843977506935]
We propose an adaptive variance method, called AdaSpider, for $L$-smooth, non-reduction functions with a finitesum structure.
In doing so, we are able to compute an $epsilon-stationary point with $tildeOleft + st/epsilon calls.
arXiv Detail & Related papers (2022-11-03T14:41:46Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of emphcon unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $epsilon$-saddle point within $O(epsilon-2/3)$ in terms of a restricted function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Multi-block-Single-probe Variance Reduced Estimator for Coupled
Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-probe Variance Reduced (MSVR) to alleviate the complexity of compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and dependence on strongity.
arXiv Detail & Related papers (2022-07-18T12:03:26Z) - Hessian Averaging in Stochastic Newton Methods Achieves Superlinear
Convergence [69.65563161962245]
We consider a smooth and strongly convex objective function using a Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local convergence at an optimal stage.
arXiv Detail & Related papers (2022-04-20T07:14:21Z) - A Projection-free Algorithm for Constrained Stochastic Multi-level
Composition Optimization [12.096252285460814]
We propose a projection-free conditional gradient-type algorithm for composition optimization.
We show that the number of oracles and the linear-minimization oracle required by the proposed algorithm, are of order $mathcalO_T(epsilon-2)$ and $mathcalO_T(epsilon-3)$ respectively.
arXiv Detail & Related papers (2022-02-09T06:05:38Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We find an algorithm which finds.
epsilon$-approximate stationary point (with $|nabla F(x)|le epsilon$) using.
$(epsilon,gamma)$surimate random random points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z) - Stochastic Gauss-Newton Algorithms for Nonconvex Compositional
Optimization [26.313415590777858]
We develop two new Gauss-Newton algorithms for solving a class of non- compositional optimization problems.
We consider both the expectation and finite-sum settings under standard assumptions.
arXiv Detail & Related papers (2020-02-17T22:56:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.