Online Statistical Inference for Nonlinear Stochastic Approximation with
Markovian Data
- URL: http://arxiv.org/abs/2302.07690v1
- Date: Wed, 15 Feb 2023 14:31:11 GMT
- Title: Online Statistical Inference for Nonlinear Stochastic Approximation with
Markovian Data
- Authors: Xiang Li, Jiadong Liang, Zhihua Zhang
- Abstract summary: We study the statistical inference of nonlinear stochastic approximation algorithms utilizing a single trajectory of Markovian data.
Our methodology has practical applications in various scenarios, such as Stochastic Gradient Descent (SGD) on autoregressive data and asynchronous Q-Learning.
- Score: 22.59079286063505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the statistical inference of nonlinear stochastic approximation
algorithms utilizing a single trajectory of Markovian data. Our methodology has
practical applications in various scenarios, such as Stochastic Gradient
Descent (SGD) on autoregressive data and asynchronous Q-Learning. By utilizing
the standard stochastic approximation (SA) framework to estimate the target
parameter, we establish a functional central limit theorem for its partial-sum
process, $\boldsymbol{\phi}_T$. To further support this theory, we provide a
matching semiparametric efficient lower bound and a non-asymptotic upper bound
on its weak convergence, measured in the L\'evy-Prokhorov metric. This
functional central limit theorem forms the basis for our inference method. By
selecting any continuous scale-invariant functional $f$, the asymptotic pivotal
statistic $f(\boldsymbol{\phi}_T)$ becomes accessible, allowing us to construct
an asymptotically valid confidence interval. We analyze the rejection
probability of a family of functionals $f_m$, indexed by $m \in \mathbb{N}$,
through theoretical and numerical means. The simulation results demonstrate the
validity and efficiency of our method.
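As an illustration of this inference pipeline (a minimal sketch, not the paper's exact construction), the example below runs SGD with Polyak-Ruppert averaging on a single AR(1) trajectory and forms a confidence interval from a scale-invariant functional of the partial-sum process, the "random scaling" t-statistic. The step size, noise scale, and the critical value 6.747 (the 97.5% quantile tabulated by Abadir and Paruolo, 1997) are assumptions made for this example, not values taken from the paper.

```python
import numpy as np

# Illustrative sketch only: estimate the mean of an AR(1) chain via SGD with
# Polyak-Ruppert averaging, using a single Markovian trajectory.  The CI is
# built from a scale-invariant functional of the partial-sum process (the
# "random scaling" t-statistic).  Step size t^{-0.7}, noise scale 0.1, and the
# critical value 6.747 are assumed here for illustration.
rng = np.random.default_rng(0)
T, rho, mu = 50_000, 0.5, 2.0            # horizon, AR(1) coefficient, true mean

x, theta, s = 0.0, 0.0, 0.0
bars = np.empty(T)                       # running averages \bar{theta}_t
for t in range(1, T + 1):
    # One step of the Markov chain; its stationary mean is mu.
    x = rho * x + (1.0 - rho) * mu + 0.1 * rng.standard_normal()
    theta -= t ** -0.7 * (theta - x)     # SA step for the quadratic loss
    s += theta
    bars[t - 1] = s / t                  # Polyak-Ruppert average

bar_T = bars[-1]
t_idx = np.arange(1, T + 1)
# Random-scaling variance functional of the partial-sum process: it is
# scale-invariant, so the resulting t-statistic is asymptotically pivotal.
V = np.sum(t_idx ** 2 * (bars - bar_T) ** 2) / T ** 2
half = 6.747 * np.sqrt(V / T)            # critical value times the scale
ci = (bar_T - half, bar_T + half)        # asymptotically valid 95% CI
print(bar_T, ci)
```

No plug-in estimate of the asymptotic covariance is needed: the scale-invariance of the functional cancels the unknown variance, which is what makes the statistic pivotal and the procedure fully online.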
Related papers
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for this purpose.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z)
- Tractable and Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation [8.378137704007038]
We present a regret analysis for distributional reinforcement learning with general value function approximation.
Our theoretical results show that approximating the infinite-dimensional return distribution with a finite number of moment functionals is the only method to learn the statistical information unbiasedly.
arXiv Detail & Related papers (2024-07-31T00:43:51Z)
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of i.i.d. observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
- Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the finite-sample regime and in the asymptotic regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
- Fast and Robust Online Inference with Stochastic Gradient Descent via Random Scaling [0.9806910643086042]
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent algorithms.
Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
arXiv Detail & Related papers (2021-06-06T15:38:37Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.