Fugu-MT Paper Translation (Abstract): Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part II

Paper overview: Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part II

arxiv url: http://arxiv.org/abs/2603.07437v1
Date: Sun, 08 Mar 2026 03:20:52 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:14.590413
Title: Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part II
Title (reference translation): Cost-Driven Representation Learning for Linear Quadratic Gaussian Control (Part 2)
Authors: Yi Tian, Kaiqing Zhang, Russ Tedrake, Suvrit Sra
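Abstract summary: We study the problem of state representation learning for control from partial and potentially high-dimensional observations. We approach this problem via cost-driven state representation learning, in which a dynamical model in a latent state space is learned by predicting cumulative costs.
Reference score (independently computed attention score): 57.29427648134142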
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the problem of state representation learning for control from partial and potentially high-dimensional observations. We approach this problem via cost-driven state representation learning, in which we learn a dynamical model in a latent state space by predicting cumulative costs. In particular, we establish finite-sample guarantees on finding a near-optimal representation function and a near-optimal controller using the learned latent model for infinite-horizon time-invariant Linear Quadratic Gaussian (LQG) control. We study two approaches to cost-driven representation learning, which differ in whether the transition function of the latent state is learned explicitly or implicitly. The first approach has also been investigated in Part I of this work, for finite-horizon time-varying LQG control. The second approach closely resembles MuZero, a recent breakthrough in empirical reinforcement learning, in that it learns latent dynamics implicitly by predicting cumulative costs. A key technical contribution of this Part II is to prove persistency of excitation for a new stochastic process that arises from the analysis of quadratic regression in our approach, and may be of independent interest.
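Abstract (reference translation): We study the problem of state representation learning for control from partial and potentially high-dimensional observations. We approach this problem via cost-driven state representation learning, in which a dynamical model in a latent state space is learned by predicting cumulative costs. In particular, we establish finite-sample guarantees for finding a near-optimal representation function and a near-optimal controller using the learned latent model for infinite-horizon time-invariant Linear Quadratic Gaussian (LQG) control. We study two approaches to cost-driven representation learning, which differ in whether the transition function of the latent state is learned explicitly or implicitly. The first approach was also investigated in Part I of this work, for finite-horizon time-varying LQG control. The second approach closely resembles MuZero, a recent breakthrough in empirical reinforcement learning, in that it learns latent dynamics implicitly by predicting cumulative costs. A key technical contribution of this Part II is a proof of persistency of excitation for a new stochastic process that arises from the analysis of quadratic regression in our approach.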
