On the Stability of Random Matrix Product with Markovian Noise:
Application to Linear Stochastic Approximation and TD Learning
- URL: http://arxiv.org/abs/2102.00185v1
- Date: Sat, 30 Jan 2021 08:39:38 GMT
- Title: On the Stability of Random Matrix Product with Markovian Noise:
Application to Linear Stochastic Approximation and TD Learning
- Authors: Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Hoi-To Wai
- Abstract summary: This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state space Markov chain.
It is a cornerstone in the analysis of stochastic algorithms in machine learning.
- Score: 33.24726879325671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the exponential stability of random matrix products driven
by a general (possibly unbounded) state space Markov chain. It is a cornerstone
in the analysis of stochastic algorithms in machine learning (e.g. for
parameter tracking in online learning or reinforcement learning). The existing
results impose strong conditions such as uniform boundedness of the
matrix-valued functions and uniform ergodicity of the Markov chains. Our main
contribution is an exponential stability result for the $p$-th moment of random
matrix products, provided that (i) the underlying Markov chain satisfies a
super-Lyapunov drift condition, (ii) the growth of the matrix-valued functions
is controlled by an appropriately defined function (related to the drift
condition). Using this result, we give finite-time $p$-th moment bounds for
constant and decreasing stepsize linear stochastic approximation schemes with
Markovian noise on a general state space. We illustrate these findings for linear
value-function estimation in reinforcement learning. We provide finite-time
$p$-th moment bounds for various members of the temporal difference (TD) family
of algorithms.
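To make the object of study concrete: in linear stochastic approximation the error propagates through products of random matrices of the form $\prod_k (I + \gamma A(X_k))$, and exponential stability means the $p$-th moment of this product decays geometrically in the number of factors. The sketch below (illustrative only; the chain, the matrix-valued map, and all constants are arbitrary choices, not the paper's) estimates that moment by Monte Carlo:

```python
# Monte Carlo estimate of the p-th moment of a random matrix product
# driven by a Markov chain.  Everything here (the AR(1) chain, the map
# A(x), gamma, p) is an illustrative assumption, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
d, gamma, p, n_steps, n_paths = 2, 0.05, 4, 200, 500

A_bar = np.array([[-1.0, 0.3], [0.0, -0.5]])  # Hurwitz "mean" matrix

def A(x):
    # Matrix-valued function of the chain state; mildly state-dependent.
    return A_bar * (1.0 + 0.1 * np.tanh(x))

moments = np.zeros(n_steps)
for _ in range(n_paths):
    x, P = 0.0, np.eye(d)
    for k in range(n_steps):
        x = 0.9 * x + rng.normal()           # geometrically ergodic AR(1) chain
        P = (np.eye(d) + gamma * A(x)) @ P   # one more factor of the product
        moments[k] += np.linalg.norm(P, 2) ** p
moments /= n_paths

# Exponential stability: E||P_k||^p should decay roughly geometrically in k.
print(moments[::50])
```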
Related papers
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for such statistical inference.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z)
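As a loose illustration of the averaged TD scheme in the entry above, here is a minimal sketch of TD(0) with linear function approximation and Polyak-Ruppert averaging on a synthetic Markov reward process; the kernel, rewards, features, and step size are all toy choices, not taken from the paper.

```python
# TD(0) with linear function approximation and Polyak-Ruppert averaging
# on a synthetic Markov reward process.  All quantities are toy choices.
import numpy as np

rng = np.random.default_rng(1)
n_states, d, gamma, alpha, T = 5, 3, 0.9, 0.05, 20000

P = rng.dirichlet(np.ones(n_states), size=n_states)  # random transition kernel
r = rng.normal(size=n_states)                        # state rewards
Phi = rng.normal(size=(n_states, d))                 # feature map

theta, theta_bar, s = np.zeros(d), np.zeros(d), 0
for t in range(1, T + 1):
    s_next = rng.choice(n_states, p=P[s])
    td_err = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta = theta + alpha * td_err * Phi[s]          # TD(0) step
    theta_bar += (theta - theta_bar) / t             # Polyak-Ruppert average
    s = s_next
print("averaged TD estimate:", theta_bar)
```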
- Learning Unstable Continuous-Time Stochastic Linear Control Systems [0.0]
We study the problem of system identification for continuous-time dynamics based on a single finite-length state trajectory.
We present a method for estimating the possibly unstable open-loop matrix by employing properly randomized control inputs.
We establish theoretical performance guarantees showing that the estimation error decays with trajectory length, a measure of excitability, and the signal-to-noise ratio.
arXiv Detail & Related papers (2024-09-17T16:24:51Z)
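A hedged sketch of the basic idea in the entry above (not the paper's estimator): excite a continuous-time linear system dx = (Ax + Bu)dt + dW with randomized inputs and recover A and B by least squares on finite differences; the system matrices, input distribution, and noise scale are invented for illustration.

```python
# Least-squares identification of a (mildly) unstable continuous-time
# linear system from one trajectory with randomized inputs.  The system,
# input distribution, and noise scale are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
dt, T = 1e-3, 20000
A_true = np.array([[0.2, 1.0], [0.0, 0.1]])  # open-loop unstable
B_true = np.array([[0.0], [1.0]])

x, X, U, dX = np.zeros(2), [], [], []
for _ in range(T):
    u = rng.normal(size=1)                   # randomized control input
    x_next = x + (A_true @ x + B_true @ u) * dt + 0.1 * np.sqrt(dt) * rng.normal(size=2)
    X.append(x); U.append(u); dX.append((x_next - x) / dt)
    x = x_next

Z = np.hstack([np.array(X), np.array(U)])    # regressors [x, u]
theta, *_ = np.linalg.lstsq(Z, np.array(dX), rcond=None)
A_hat, B_hat = theta[:2].T, theta[2:].T      # dX ~ x A^T + u B^T
print("estimation error for A:", np.linalg.norm(A_hat - A_true))
```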
- The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise [17.493808856903303]
One fundamental challenge in analyzing a stochastic approximation algorithm is to establish its stability.
In this paper, we extend the celebrated Borkar-Meyn theorem for stability from the martingale difference noise setting to the Markovian noise setting.
Central to our analysis is the diminishing rate of change of a few functions, which is implied by both a form of strong law of large numbers and a commonly used V4 Lyapunov drift condition.
arXiv Detail & Related papers (2024-01-15T17:20:17Z)
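For context, the drift condition mentioned above belongs to the Foster-Lyapunov family; a standard geometric version (a simpler relative of the fourth-moment variant the paper invokes) reads:

```latex
% A standard geometric (Foster--Lyapunov) drift condition, shown only for
% context; the paper above invokes a fourth-moment (V4) variant of it.
\exists\, V : \mathsf{X} \to [1, \infty),\ \lambda \in (0, 1),\ b < \infty
\quad \text{such that} \quad
\mathbb{E}\bigl[ V(X_{k+1}) \mid X_k = x \bigr] \le \lambda\, V(x) + b
\quad \text{for all } x \in \mathsf{X}.
```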
- Stochastic-Constrained Stochastic Optimization with Markovian Data [2.1756081703276]
We study the setting where data samples are drawn from a Markov chain and thus are not independent and identically distributed.
We propose two variants of drift-plus-penalty: one for the case when the mixing time of the underlying Markov chain is known, and one for the case when it is unknown.
Our algorithms apply to a more general setting of constrained online convex optimization where the sequence of constraint functions follows a Markov chain.
arXiv Detail & Related papers (2023-12-07T14:09:27Z)
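A minimal sketch of a generic drift-plus-penalty update, for readers unfamiliar with the scheme named above (not the paper's exact algorithm): a virtual queue Q accumulates constraint violation and is traded off against the objective gradient. The two-state chain and toy objective below are my assumptions.

```python
# Generic drift-plus-penalty iteration on a toy constrained problem with a
# two-state Markov chain driving the objective.  Not the paper's algorithm;
# V, eta, the chain, and the functions are all illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
V, eta, T = 10.0, 0.01, 5000      # penalty weight, step size, horizon
x, Q, state = 0.0, 0.0, 0         # decision variable, virtual queue, chain state

for t in range(T):
    state = state if rng.random() < 0.8 else 1 - state   # Markovian data
    a = 2.0 if state == 0 else -1.0
    grad_f = 2.0 * (x - a)        # objective f_t(x) = (x - a_t)^2
    g = x - 0.5                   # constraint g(x) = x - 0.5 <= 0
    x -= eta * (V * grad_f + Q * 1.0)   # penalty/drift trade-off step
    Q = max(0.0, Q + g)                 # virtual queue tracks violation
print(f"x = {x:.3f} (constraint: x <= 0.5), Q = {Q:.2f}")
```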
- Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability [17.771354881467435]
We show that a simple algorithm with a universal and instance-independent step size is sufficient to obtain near-optimal variance and bias terms.
Our proof technique is based on refined error bounds for linear stochastic approximation together with a novel stability result for products of random matrices.
arXiv Detail & Related papers (2023-10-22T12:37:25Z)
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
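A rough sketch of a stochastic augmented Lagrangian loop in the spirit of the entry above (the toy problem, constants, and inner/outer loop sizes are my assumptions, not the paper's method): inexact inner SGD on the augmented Lagrangian, then a multiplier update.

```python
# Sketch of a stochastic augmented Lagrangian loop: inexact inner SGD on
# the augmented Lagrangian, then a multiplier update.  The toy problem
# (minimize E[(x - xi)^2] subject to x >= 1) and constants are assumptions.
import numpy as np

rng = np.random.default_rng(4)
x, lam, rho, eta = 0.0, 0.0, 5.0, 0.02

for outer in range(50):
    xs = []
    for _ in range(200):                       # inexact inner minimization
        xi = rng.normal()                      # stochastic data sample
        g = 1.0 - x                            # constraint g(x) = 1 - x <= 0
        grad = 2.0 * (x - xi) - max(0.0, lam + rho * g)
        x -= eta * grad
        xs.append(x)
    lam = max(0.0, lam + rho * (1.0 - np.mean(xs)))  # dual update
print(f"x = {x:.3f} (expect ~1), lambda = {lam:.2f} (expect ~2)")
```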
- Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z)
- Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees [57.67528738886731]
We study the numerical stability of scalable sparse approximations based on inducing points.
We give sufficient conditions on the inducing points for numerical stability; for low-dimensional tasks such as geospatial modeling, we also propose an automated method for computing inducing points satisfying these conditions.
arXiv Detail & Related papers (2022-10-14T15:20:17Z)
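A small sketch of the stability idea behind the entry above (my illustration, not the paper's cover-tree algorithm): enforcing a minimum pairwise separation between inducing points keeps the inducing kernel matrix well conditioned. The greedy selection rule and all constants below are assumptions.

```python
# Greedy selection of inducing points with a guaranteed minimum pairwise
# separation (an illustration of the stability idea, not the paper's
# cover-tree algorithm).  Data, separation, and lengthscale are arbitrary.
import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(size=(1000, 2))     # training inputs (e.g. geospatial)
min_sep = 0.15                      # required pairwise separation

inducing = [X[0]]
for x in X[1:]:
    if min(np.linalg.norm(x - z) for z in inducing) >= min_sep:
        inducing.append(x)
Z = np.array(inducing)

# Well-separated points keep the inducing kernel matrix far from singular.
K = np.exp(-np.square(Z[:, None] - Z[None, :]).sum(-1) / (2 * 0.1**2))
print(len(Z), "inducing points, cond(K) =", np.linalg.cond(K))
```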
- Non-Markovian Stochastic Schr\"odinger Equation: Matrix Product State Approach to the Hierarchy of Pure States [65.25197248984445]
We derive a hierarchy of matrix product states (HOMPS) for non-Markovian dynamics in open quantum systems at finite temperature.
The validity and efficiency of HOMPS are demonstrated for the spin-boson model and for long chains where each site is coupled to a structured, strongly non-Markovian environment.
arXiv Detail & Related papers (2021-09-14T01:47:30Z)
- A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices [30.49132891333353]
We prove a Chernoff-type bound for sums of matrix-valued random variables sampled via a regular (aperiodic and irreducible) finite Markov chain.
Our proof is based on the matrix expander (regular undirected graph) Chernoff bound [Garg et al. STOC '18] and scalar Chernoff-Hoeffding bounds for Markov chains.
Our matrix Chernoff bound for Markov chains can be applied to analyze the behavior of co-occurrence statistics for sequential data.
arXiv Detail & Related papers (2020-08-06T05:44:54Z)
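For concreteness, the co-occurrence statistic in the entry above counts state pairs within a sliding window along a single trajectory (the quantity behind DeepWalk-style embeddings). The chain and window size in this sketch are arbitrary toy choices.

```python
# Empirical co-occurrence matrix of a finite Markov chain: counts of state
# pairs within a sliding window along one trajectory (the statistic behind
# DeepWalk-style embeddings).  Chain and window size are toy choices.
import numpy as np

rng = np.random.default_rng(6)
n, T, window = 4, 50000, 2
P = rng.dirichlet(np.ones(n), size=n)   # aperiodic, irreducible chain

walk = np.empty(T, dtype=int)
walk[0] = 0
for t in range(1, T):
    walk[t] = rng.choice(n, p=P[walk[t - 1]])

C = np.zeros((n, n))
for t in range(T - window):
    for w in range(1, window + 1):
        C[walk[t], walk[t + w]] += 1
C /= C.sum()
# The matrix Chernoff bound controls how fast C concentrates around its
# stationary expectation as T grows.
print(np.round(C, 3))
```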
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
We study the asymptotic and non-asymptotic properties of linear stochastic approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z)
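A toy numerical check (not from the paper) of the averaged-iterate CLT stated in the entry above: for a scalar linear stochastic approximation recursion with fixed step size, the rescaled error of the Polyak-Ruppert average should look approximately Gaussian. All constants are illustrative.

```python
# Toy check of the CLT for Polyak-Ruppert averaged linear stochastic
# approximation with a fixed step size.  All constants are illustrative.
import numpy as np

rng = np.random.default_rng(7)
a, b, alpha, n, reps = 1.0, 2.0, 0.1, 10000, 2000
theta_star = b / a                        # fixed point of the recursion

theta, avg = np.zeros(reps), np.zeros(reps)   # many independent runs at once
for k in range(1, n + 1):
    theta += alpha * (b - a * theta + rng.normal(size=reps))  # linear SA step
    avg += (theta - avg) / k                                  # running average
errors = np.sqrt(n) * (avg - theta_star)
# CLT: mean near 0; std near sigma/a = 1 for small alpha.
print(f"mean = {errors.mean():.3f}, std = {errors.std():.3f}")
```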