Fugu-MT Paper Translation (Abstract): Streaming Linear System Identification with Reverse Experience Replay

Paper summary: Streaming Linear System Identification with Reverse Experience Replay

arxiv url: http://arxiv.org/abs/2103.05896v1
Date: Wed, 10 Mar 2021 06:51:55 GMT
Status: Translation complete
Last updated in system: 2021-03-11 15:00:59.296997
Title: Streaming Linear System Identification with Reverse Experience Replay
Title (reference translation): Streaming Linear System Identification Using Reverse Experience Replay
Authors: Prateek Jain, Suhas S Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli
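Abstract summary: This paper considers the problem of estimating a linear time-invariant (LTI) dynamical system from a single trajectory via streaming algorithms. In many problems encountered in reinforcement learning (RL), it is important to estimate the parameters on the go using a gradient oracle. The paper proposes a novel algorithm, SGD with Reverse Experience Replay (SGD-RER), inspired by the experience replay (ER) technique popular in the RL literature.
Reference score (independently computed attention score): 45.17023170054112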
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the problem of estimating a stochastic linear time-invariant (LTI) dynamical system from a single trajectory via streaming algorithms. The problem is equivalent to estimating the parameters of vector auto-regressive (VAR) models encountered in time series analysis (Hamilton, 2020). A recent sequence of papers (Faradonbeh et al., 2018; Simchowitz et al., 2018; Sarkar and Rakhlin, 2019) shows that ordinary least squares (OLS) regression can be used to provide an optimal finite-time estimator for the problem. However, such techniques apply to the offline setting, where the optimal solution of OLS is available a priori. In many problems of interest, such as those encountered in reinforcement learning (RL), it is important to estimate the parameters on the go using a gradient oracle. This task is challenging since standard methods like SGD might not perform well when using stochastic gradients from correlated data points (Györfi and Walk, 1996; Nagaraj et al., 2020). In this work, we propose a novel algorithm, SGD with Reverse Experience Replay (SGD-RER), that is inspired by the experience replay (ER) technique popular in the RL literature (Lin, 1992). SGD-RER divides data into small buffers and runs SGD backwards on the data stored in the individual buffers. We show that this algorithm exactly deconstructs the dependency structure and obtains information-theoretically optimal guarantees for both parameter error and prediction error for standard problem settings. Thus, to the best of our knowledge, we provide the first optimal SGD-style algorithm for the classical problem of linear system identification, also known as VAR model estimation. Our work demonstrates that knowledge of dependency structure can aid us in designing algorithms which can deconstruct the dependencies between samples optimally in an online fashion.
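Abstract (reference translation): We consider the problem of estimating a stochastic linear time-invariant (LTI) dynamical system from a single trajectory via streaming algorithms. This problem is equivalent to estimating the parameters of vector auto-regressive (VAR) models encountered in time series analysis (Hamilton, 2020). Recent papers (Faradonbeh et al., 2018; Simchowitz et al., 2018; Sarkar and Rakhlin, 2019) show that ordinary least squares (OLS) regression can provide an optimal finite-time estimator for the problem. However, such techniques apply to the offline setting, where the optimal OLS solution is available. In many problems encountered in reinforcement learning (RL), it is important to estimate the parameters on the go using a gradient oracle. This task is difficult because standard methods such as SGD may not perform well when using stochastic gradients from correlated data points (Györfi and Walk, 1996; Nagaraj et al., 2020). In this work, we propose SGD with Reverse Experience Replay (SGD-RER), a new algorithm inspired by the experience replay (ER) technique popular in the RL literature (Lin, 1992). SGD-RER divides the data into small buffers and runs SGD backwards on the data stored in each buffer. The algorithm exactly deconstructs the dependency structure and obtains theoretically optimal guarantees for both parameter error and prediction error in standard problem settings. To the best of our knowledge, this is the first optimal SGD-style algorithm for the classical problem of linear system identification, that is, VAR model estimation. Our work shows that knowledge of the dependency structure helps in designing algorithms that can optimally deconstruct the dependencies between samples in an online fashion.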
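The buffer-and-reverse idea described in the abstract can be illustrated with a minimal Python sketch. It is a simplified reading of the abstract, not the authors' implementation: the function names (`simulate_lti`, `sgd_rer`), the buffer size, the gap left between consecutive buffers, the constant step size, and the squared-error loss are all assumptions chosen for illustration, and any further details of the paper's algorithm (for example, how iterates are combined into a final estimate) are not stated in the abstract and are omitted here.

```python
import numpy as np


def simulate_lti(A_star, T, noise_std=1.0, seed=0):
    """Simulate X[t+1] = A_star @ X[t] + noise from X[0] = 0; returns shape (T + 1, d)."""
    rng = np.random.default_rng(seed)
    d = A_star.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = A_star @ X[t] + noise_std * rng.standard_normal(d)
    return X


def sgd_rer(X, buffer_size=50, gap=5, step_size=0.002):
    """Sketch of SGD with reverse experience replay on one trajectory X.

    Hypothetical parameters (not taken from the paper): buffer_size, gap, step_size.
    The loss assumed here is ||A x_t - x_{t+1}||^2 for each transition.
    """
    d = X.shape[1]
    A = np.zeros((d, d))
    num_pairs = X.shape[0] - 1  # number of (x_t, x_{t+1}) transitions
    start = 0
    while start + buffer_size <= num_pairs:
        # Within a buffer, visit the transitions from newest to oldest.
        for t in range(start + buffer_size - 1, start - 1, -1):
            residual = A @ X[t] - X[t + 1]
            # Gradient of the squared error with respect to A is 2 * residual x_t^T.
            A -= 2.0 * step_size * np.outer(residual, X[t])
        # Skip `gap` transitions before the next buffer (an assumption of this sketch).
        start += buffer_size + gap
    return A


if __name__ == "__main__":
    A_star = np.array([[0.5, 0.2, 0.0],
                       [0.0, 0.4, 0.1],
                       [0.0, 0.0, 0.3]])
    X = simulate_lti(A_star, T=20000)
    A_hat = sgd_rer(X)
    print("Frobenius error:", np.linalg.norm(A_hat - A_star))
```

For simplicity the sketch takes the whole trajectory as an array. In a true streaming setting only the current buffer would need to be held in memory, since each buffer is processed once and then discarded.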

List of related papers