Streaming Linear System Identification with Reverse Experience Replay
- URL: http://arxiv.org/abs/2103.05896v1
- Date: Wed, 10 Mar 2021 06:51:55 GMT
- Title: Streaming Linear System Identification with Reverse Experience Replay
- Authors: Prateek Jain, Suhas S Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli
- Abstract summary: We consider the problem of estimating a linear time-invariant (LTI) dynamical system from a single trajectory via streaming algorithms.
In many problems of interest encountered in reinforcement learning (RL), it is important to estimate the parameters on the fly using a gradient oracle.
We propose a novel algorithm, SGD with Reverse Experience Replay (SGD-RER), inspired by the experience replay (ER) technique popular in the RL literature.
- Score: 45.17023170054112
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of estimating a stochastic linear time-invariant
(LTI) dynamical system from a single trajectory via streaming algorithms. The
problem is equivalent to estimating the parameters of vector auto-regressive
(VAR) models encountered in time series analysis (Hamilton, 2020). A recent
sequence of papers (Faradonbeh et al., 2018; Simchowitz et al., 2018; Sarkar
and Rakhlin, 2019) shows that ordinary least squares (OLS) regression provides
an optimal finite-time estimator for this problem. However, such techniques
apply to the offline setting, where the optimal OLS solution is available a
priori. In many problems of interest encountered in reinforcement learning
(RL), it is important to estimate the parameters on the fly using a gradient
oracle. This task is challenging since standard methods like SGD might not
perform well when using stochastic gradients from correlated data points
(Györfi and Walk, 1996; Nagaraj et al., 2020).
In this work, we propose a novel algorithm, SGD with Reverse Experience
Replay (SGD-RER), inspired by the experience replay (ER) technique popular in
the RL literature (Lin, 1992). SGD-RER divides the data stream into small
buffers and runs SGD backwards on the data stored in each buffer. We show that
this algorithm exactly deconstructs the dependency structure and obtains
information-theoretically optimal guarantees for both parameter error and
prediction error in standard problem settings. Thus, we provide the first, to
the best of our knowledge, optimal SGD-style algorithm for the classical
problem of linear system identification, a.k.a. VAR model estimation. Our work
demonstrates that knowledge of the dependency structure can aid in designing
algorithms that deconstruct the dependencies between samples optimally in an
online fashion.
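For intuition, here is a minimal sketch, not from the paper, of the streaming setting the abstract describes: a stable LTI system $X_{t+1} = A^* X_t + \eta_t$ observed along a single trajectory, with naive one-pass SGD run on the correlated stream. The dimensions, step size, and stability rescaling below are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's code): simulate a
# stable stochastic LTI system X_{t+1} = A* X_t + eta_t from one trajectory
# and run naive streaming SGD on the squared prediction error.
import numpy as np

rng = np.random.default_rng(0)
d, T, lr = 5, 20_000, 1e-3

# Ground-truth system matrix, rescaled so its spectral radius is below 1.
A_star = rng.standard_normal((d, d))
A_star *= 0.9 / np.max(np.abs(np.linalg.eigvals(A_star)))

# Single trajectory: consecutive pairs (X_t, X_{t+1}) are correlated.
X = np.zeros((T + 1, d))
for t in range(T):
    X[t + 1] = A_star @ X[t] + rng.standard_normal(d)

# Naive forward SGD: gradient of 0.5 * ||A X_t - X_{t+1}||^2 w.r.t. A.
A_sgd = np.zeros((d, d))
for t in range(T):
    A_sgd -= lr * np.outer(A_sgd @ X[t] - X[t + 1], X[t])

print("forward SGD parameter error:", np.linalg.norm(A_sgd - A_star))
```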
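Continuing the same sketch, the buffering idea behind SGD-RER can be mimicked by processing each buffer's samples in reverse order. The buffer size here is an assumption, and the paper's inter-buffer gaps and tail averaging of iterates are omitted for brevity.

```python
# Hedged sketch of the SGD-RER idea, reusing X, A_star, d, T, lr from above.
# Buffer size B is assumed; the paper's gap between buffers and tail
# averaging are omitted to keep the illustration short.
B = 100
A_rer = np.zeros((d, d))
for start in range(0, T, B):
    # Reverse experience replay: step through the buffer newest-to-oldest,
    # which breaks the forward temporal dependence between the iterate and
    # the sample it is updated on.
    for t in range(min(start + B, T) - 1, start - 1, -1):
        A_rer -= lr * np.outer(A_rer @ X[t] - X[t + 1], X[t])

print("SGD-RER parameter error:", np.linalg.norm(A_rer - A_star))
```

The reversed update order is the only change from the naive pass above, mirroring the abstract's point that exploiting the known dependency structure is what drives the improvement.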
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
arXiv Detail & Related papers (2024-10-17T11:47:56Z) - Statistical Learning and Inverse Problems: A Stochastic Gradient Approach [0.0]
Inverse problems are paramount in science and engineering.
In this paper, we consider the setup of Statistical Inverse Problems (SIP) and demonstrate how Stochastic Gradient Descent (SGD) algorithms can be used in the linear SIP setting.
arXiv Detail & Related papers (2022-09-29T17:42:01Z) - One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines.
arXiv Detail & Related papers (2022-07-28T02:01:31Z) - StreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of kernel ridge regression (KRR) require that all the data be stored in main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z) - Robust Regression Revisited: Acceleration and Improved Estimation Rates [25.54653340884806]
We study fast algorithms for statistical regression problems under the strong contamination model.
The goal is to approximately optimize a generalized linear model (GLM) given adversarially corrupted samples.
We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees.
arXiv Detail & Related papers (2021-06-22T17:21:56Z) - A spectral algorithm for robust regression with subgaussian rates [0.0]
We study a new algorithm, running in linear up to quadratic time, for linear regression without strong assumptions on the underlying sample distributions.
The goal is to design a procedure which attains the optimal sub-gaussian error bound even though the data have only finite moments.
arXiv Detail & Related papers (2020-07-12T19:33:50Z) - Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
arXiv Detail & Related papers (2020-06-29T23:35:53Z) - Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data-points are dependent and are sampled from a Markov chain.
We establish sharp information-theoretic minimax lower bounds for this problem in terms of the mixing time $\tau_{\mathsf{mix}}$.
We propose an algorithm based on experience replay, a popular reinforcement learning technique, that achieves a significantly better error rate.
arXiv Detail & Related papers (2020-06-16T04:26:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.