Optimal variance-reduced stochastic approximation in Banach spaces
- URL: http://arxiv.org/abs/2201.08518v1
- Date: Fri, 21 Jan 2022 02:46:57 GMT
- Title: Optimal variance-reduced stochastic approximation in Banach spaces
- Authors: Wenlong Mou, Koulik Khamaru, Martin J. Wainwright, Peter L. Bartlett,
Michael I. Jordan
- Abstract summary: We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.
We establish non-asymptotic bounds for both the operator defect and the estimation error.
- Score: 114.8734960258221
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of estimating the fixed point of a contractive operator
defined on a separable Banach space. Focusing on a stochastic query model that
provides noisy evaluations of the operator, we analyze a variance-reduced
stochastic approximation scheme, and establish non-asymptotic bounds for both
the operator defect and the estimation error, measured in an arbitrary
semi-norm. In contrast to worst-case guarantees, our bounds are
instance-dependent, and achieve the local asymptotic minimax risk
non-asymptotically. For linear operators, contractivity can be relaxed to
multi-step contractivity, so that the theory can be applied to problems such as
the average-reward policy evaluation problem in reinforcement learning. We
illustrate the theory via applications to stochastic shortest path problems,
two-player zero-sum Markov games, as well as policy evaluation and $Q$-learning
for tabular Markov decision processes.
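As a concrete illustration of the scheme's structure, the sketch below runs an epoch-based, recentered variance-reduction update on a tabular Bellman fixed point, where the operator $T(v) = r + \gamma P v$ is a $\gamma$-contraction in the sup-norm. This is a minimal sketch in the spirit of the paper's setup, not the paper's exact algorithm: the step size, epoch lengths, and sample counts are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: a Markov reward process whose Bellman operator
# T(v) = r + gamma * P @ v is a gamma-contraction in the sup-norm.
n, gamma = 30, 0.9
P = rng.dirichlet(np.ones(n), size=n)               # row-stochastic transitions
r = rng.uniform(size=n)                             # per-state rewards
v_star = np.linalg.solve(np.eye(n) - gamma * P, r)  # exact fixed point
cum_P = np.cumsum(P, axis=1)

def sample_next():
    """One stochastic query: a random next state for every state."""
    u = rng.random(n)
    return (u[:, None] > cum_P).sum(axis=1)

def T_hat(v, next_states):
    """Unbiased single-sample estimate of T(v) given sampled transitions."""
    return r + gamma * v[next_states]

def variance_reduced_sa(epochs=6, inner=1000, recenter_samples=1000, eta=0.1):
    v_bar = np.zeros(n)          # reference point for recentering
    v = v_bar.copy()
    for _ in range(epochs):
        # Monte Carlo estimate of T(v_bar), refreshed once per epoch.
        T_bar = np.mean([T_hat(v_bar, sample_next())
                         for _ in range(recenter_samples)], axis=0)
        for _ in range(inner):
            ns = sample_next()
            # Recentered update: the same sampled transitions are applied to
            # v and v_bar, so the noise cancels as v approaches v_bar.
            v = (1 - eta) * v + eta * (T_hat(v, ns) - T_hat(v_bar, ns) + T_bar)
        v_bar = v.copy()
    return v

v_hat = variance_reduced_sa()
print("sup-norm error:", np.max(np.abs(v_hat - v_star)))
```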
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing a quadratic approximation of the objective subject to a trust-region constraint.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
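As a hedged toy illustration of the two ingredients named above (this is not the paper's TR-SQP method, which treats stochastic constrained problems with random models), the sketch below minimizes a quadratic model within a trust region by comparing a gradient (Cauchy) step with an eigen step along the most negative curvature direction:
```python
import numpy as np

def trust_region_step(g, H, radius):
    """Crude minimizer of the model m(p) = g @ p + 0.5 * p @ H @ p over
    ||p|| <= radius: take the better of a gradient (Cauchy) step and an
    eigen step along the most negative curvature direction."""
    m = lambda p: g @ p + 0.5 * p @ H @ p
    # Gradient step: exact line search for the model along -g, clipped.
    gnorm, gHg = np.linalg.norm(g), g @ H @ g
    t = radius / gnorm if gHg <= 0 else min(gnorm**2 / gHg, radius / gnorm)
    candidates = [-t * g]
    # Eigen step: exploit negative curvature of H when it exists.
    lam, V = np.linalg.eigh(H)
    if lam[0] < 0:
        d = V[:, 0] if g @ V[:, 0] <= 0 else -V[:, 0]   # descent orientation
        candidates.append(radius * d)
    return min(candidates, key=m)

# Demo near a saddle: one positive and one negative curvature direction.
H = np.diag([2.0, -1.0])
x = np.array([1.0, 0.0])
print(trust_region_step(H @ x, H, radius=0.5))
```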
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
- A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variants of the stochastic proximal point method (SPPM).
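For concreteness, a minimal sketch of the plain SPPM update on least squares, where each single-sample proximal subproblem has a closed form; the variants unified by the paper go well beyond this, and the constant step size here is an arbitrary choice (a decaying one would remove the noise floor):
```python
import numpy as np

rng = np.random.default_rng(0)

# Least-squares objective: f(x) = (1/m) * sum_i 0.5 * (a_i @ x - b_i)^2.
m, d = 200, 5
A = rng.normal(size=(m, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.1 * rng.normal(size=m)

def sppm(gamma=1.0, steps=5000):
    """SPPM: x+ = argmin_z 0.5 * (a_i @ z - b_i)^2 + ||z - x||^2 / (2 * gamma).
    For a single least-squares term, this prox step has a closed form."""
    x = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(m)
        a = A[i]
        x = x - gamma * (a @ x - b[i]) / (1.0 + gamma * (a @ a)) * a
    return x

x_hat = sppm()   # constant gamma: a small noise floor remains
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print("distance to the least-squares solution:", np.linalg.norm(x_hat - x_ls))
```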
arXiv Detail & Related papers (2024-05-24T21:09:19Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC), that can be applied to either risk-seeking or risk-averse policy optimization.
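The paper's UBE is not reproduced here; as a hedged illustration of the general shape of such equations, the sketch below solves a UBE of the classical form $U = u + \gamma^2 P_\pi U$ for an assumed local-uncertainty vector $u$, both in closed form and by fixed-point iteration:
```python
import numpy as np

rng = np.random.default_rng(0)

# A UBE of the classical shape: U = u + gamma^2 * P_pi @ U, where u is a
# local-uncertainty term. The operator is a gamma^2-contraction, so the
# equation has a unique solution.
n, gamma = 20, 0.95
P_pi = rng.dirichlet(np.ones(n), size=n)   # policy-induced transition matrix
u = rng.uniform(size=n)                    # assumed local uncertainty per state

U_direct = np.linalg.solve(np.eye(n) - gamma**2 * P_pi, u)   # closed form

U = np.zeros(n)                            # fixed-point iteration
for _ in range(500):
    U = u + gamma**2 * P_pi @ U
print("max deviation between the two solutions:", np.max(np.abs(U - U_direct)))
```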
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima [0.0]
Non-isolated minima present a unique challenge that has remained under-explored.
In this paper, we study the local convergence of the stochastic gradient descent method to non-isolated global minima.
arXiv Detail & Related papers (2022-03-21T13:33:37Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the underlying Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
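A concrete member of this class is TD(0) with linear function approximation. The sketch below runs the textbook update on a random Markov reward process; the feature map, step size, and horizon are arbitrary, and TD converges to the projected fixed point, so a residual approximation error against the true values is expected:
```python
import numpy as np

rng = np.random.default_rng(0)

# Random Markov reward process and a low-dimensional linear feature map.
n, k, gamma = 50, 8, 0.9
P = rng.dirichlet(np.ones(n), size=n)
cum_P = np.cumsum(P, axis=1)
r = rng.uniform(size=n)
Phi = rng.normal(size=(n, k))    # row s is the feature vector phi(s)
v_star = np.linalg.solve(np.eye(n) - gamma * P, r)

# TD(0) with linear function approximation: V(s) ~ Phi[s] @ theta.
theta, s, alpha = np.zeros(k), 0, 0.01
for _ in range(100_000):
    s_next = int(np.searchsorted(cum_P[s], rng.random()))
    td_error = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta += alpha * td_error * Phi[s]
    s = s_next

print("distance to true values:", np.linalg.norm(Phi @ theta - v_star))
```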
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
- Convergence Properties of Stochastic Hypergradients [38.64355126221992]
We study approximation schemes for the hypergradient, which are important when the lower-level problem is empirical risk minimization over a large dataset.
We provide numerical experiments to support our theoretical analysis and to show the advantage of using hypergradients in practice.
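As a generic (not paper-specific) illustration of hypergradient approximation, the sketch below takes a ridge-regression lower-level problem and compares the exact implicit-function-theorem hypergradient with an approximation built from an inexact minibatch-SGD inner solve and a truncated Neumann series using minibatch Hessian-vector products:
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bilevel problem: inner ridge regression
#   w*(lam) = argmin_w (1/(2n)) * ||X w - y||^2 + (lam/2) * ||w||^2,
# outer validation loss F(lam) = 0.5 * ||Xv w*(lam) - yv||^2.
n, nv, d, lam = 400, 100, 10, 0.5
X, Xv = rng.normal(size=(n, d)), rng.normal(size=(nv, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
yv = Xv @ w_true + 0.1 * rng.normal(size=nv)

H = X.T @ X / n + lam * np.eye(d)          # inner Hessian (constant here)
w_star = np.linalg.solve(H, X.T @ y / n)   # exact inner solution

# Exact hypergradient via the implicit function theorem:
# dF/dlam = -grad_w F(w*) @ H^{-1} w*  (the derivative of the inner
# gradient with respect to lam is simply w).
g_out = Xv.T @ (Xv @ w_star - yv)
hg_exact = -g_out @ np.linalg.solve(H, w_star)

# Approximation: inexact inner solve by minibatch SGD ...
w = np.zeros(d)
for _ in range(3000):
    i = rng.integers(0, n, size=32)
    w -= 0.1 * (X[i].T @ (X[i] @ w - y[i]) / 32 + lam * w)

# ... plus a truncated Neumann series, H^{-1} z ~ beta * sum_k (I - beta*H)^k z,
# evaluated with minibatch Hessian-vector products.
def hvp(v):
    i = rng.integers(0, n, size=64)
    return X[i].T @ (X[i] @ v) / 64 + lam * v

beta, v, acc = 0.5, w.copy(), np.zeros(d)
for _ in range(50):
    acc += beta * v
    v -= beta * hvp(v)
hg_approx = -(Xv.T @ (Xv @ w - yv)) @ acc
print("exact:", hg_exact, " approx:", hg_approx)
```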
arXiv Detail & Related papers (2020-11-13T20:50:36Z)
- Sparse recovery by reduced variance stochastic approximation [5.672132510411465]
We discuss the application of iterative quadratic optimization routines to the problem of sparse signal recovery from noisy observations.
We show how one can straightforwardly enhance the reliability of the corresponding solution by using Median-of-Means-like techniques.
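A minimal sketch of the Median-of-Means idea: partition the sample into blocks, average within blocks, and report the median of the block means, which is far more robust to heavy tails than the plain mean (block count and test distribution below are arbitrary):
```python
import numpy as np

rng = np.random.default_rng(0)

def median_of_means(x, k=10):
    """Split the sample into k blocks, average each block, take the median."""
    blocks = np.array_split(rng.permutation(x), k)
    return np.median([b.mean() for b in blocks])

# Heavy-tailed sample with true mean 0: Student-t with 2 degrees of freedom.
x = rng.standard_t(df=2, size=10_000)
print("mean:", x.mean(), "median-of-means:", median_of_means(x))
```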
arXiv Detail & Related papers (2020-06-11T12:31:20Z)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis [102.29671176698373]
We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$-error under a generative model.
We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms.
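As a simple baseline of the kind such instance-dependent bounds are compared against, here is a plug-in (model-based) policy evaluation sketch under a generative model; problem size and per-state sample count are illustrative:
```python
import numpy as np

rng = np.random.default_rng(0)

# Tabular policy evaluation under a generative model: draw N next-state
# samples per state, form the empirical model, and solve for the values.
n, gamma, N = 30, 0.9, 2000
P = rng.dirichlet(np.ones(n), size=n)   # true (unknown) transitions
r = rng.uniform(size=n)
v_star = np.linalg.solve(np.eye(n) - gamma * P, r)

P_hat = np.zeros_like(P)
for s in range(n):
    samples = rng.choice(n, size=N, p=P[s])   # N generative-model queries
    P_hat[s] = np.bincount(samples, minlength=n) / N

v_hat = np.linalg.solve(np.eye(n) - gamma * P_hat, r)
print("ell_inf error of the plug-in estimate:", np.max(np.abs(v_hat - v_star)))
```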
arXiv Detail & Related papers (2020-03-16T17:15:28Z)