Fugu-MT Paper Translation (Abstract): Impact of Connectivity on Laplacian Representations in Reinforcement Learning

Paper abstract: Impact of Connectivity on Laplacian Representations in Reinforcement Learning

arxiv url: http://arxiv.org/abs/2603.08558v1
Date: Mon, 09 Mar 2026 16:20:31 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:16.406369
Title: Impact of Connectivity on Laplacian Representations in Reinforcement Learning
Authors: Tommaso Giorgi, Pierriccardo Olivieri, Keyue Jiang, Laura Toni, Matteo Papini
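Abstract summary: This work proves an upper bound on the approximation error of linear value function approximation under learned spectral features. It further bounds the error introduced by the eigenvector estimation itself, leading to an end-to-end error decomposition. The results hold for general (non-uniform) policies without any assumptions on the symmetry of the induced transition kernel.
Reference score (attention score computed independently by the site): 9.306521175972588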
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning compact state representations in Markov Decision Processes (MDPs) has proven crucial for addressing the curse of dimensionality in large-scale reinforcement learning (RL) problems. Existing principled approaches leverage structural priors on the MDP by constructing state representations as linear combinations of the state-graph Laplacian eigenvectors. When the transition graph is unknown or the state space is prohibitively large, the graph spectral features can be estimated directly via sample trajectories. In this work, we prove an upper bound on the approximation error of linear value function approximation under the learned spectral features. We show how this error scales with the algebraic connectivity of the state-graph, grounding the approximation quality in the topological structure of the MDP. We further bound the error introduced by the eigenvector estimation itself, leading to an end-to-end error decomposition across the representation learning pipeline. Additionally, our expression of the Laplacian operator for the RL setting, although equivalent to existing ones, prevents some common misunderstandings, of which we show some examples from the literature. Our results hold for general (non-uniform) policies without any assumptions on the symmetry of the induced transition kernel. We validate our theoretical findings with numerical simulations on gridworld environments.
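As a rough illustration of the objects the abstract refers to, the following Python sketch builds the state graph of a small gridworld, computes its Laplacian eigenvectors and algebraic connectivity, and fits a value-like target with a linear combination of the first k eigenvectors. It is only a sketch under strong assumptions: it uses the symmetric combinatorial Laplacian L = D - A of an undirected 4-connected grid and an exact eigendecomposition, whereas the paper covers general (non-uniform) policies without symmetry assumptions and features estimated from sample trajectories. The function names, the 5x5 grid, and the target vector are hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np


def grid_laplacian(rows, cols):
    """Combinatorial Laplacian L = D - A of an undirected 4-connected grid.

    Assumption: a symmetric, unweighted state graph. This is NOT the
    policy-dependent operator studied in the paper.
    """
    n = rows * cols
    adjacency = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:  # edge to the right neighbour
                adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
            if r + 1 < rows:  # edge to the neighbour below
                adjacency[i, i + cols] = adjacency[i + cols, i] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def projection_error(features, target):
    """Euclidean error of the best linear fit of `target` by `features`."""
    coeffs, *_ = np.linalg.lstsq(features, target, rcond=None)
    return np.linalg.norm(features @ coeffs - target)


rows, cols = 5, 5
laplacian = grid_laplacian(rows, cols)

# eigh returns eigenvalues in ascending order for a symmetric matrix.
eigvals, eigvecs = np.linalg.eigh(laplacian)

# Algebraic connectivity: the second-smallest Laplacian eigenvalue.
# For this grid it equals 2 - 2*cos(pi/5), roughly 0.382.
algebraic_connectivity = eigvals[1]

# Hypothetical value-like target: negative Manhattan distance to state (0, 0).
target = np.array(
    [-(r + c) for r in range(rows) for c in range(cols)], dtype=float
)

# Approximation error when using the k smoothest eigenvectors as features.
errors = {k: projection_error(eigvecs[:, :k], target) for k in (1, 5, 25)}

print("algebraic connectivity:", algebraic_connectivity)
for k, err in errors.items():
    print(f"k={k:2d}  projection error={err:.4f}")
```

Because the feature sets are nested, the error cannot increase with k, and it vanishes once all 25 eigenvectors are used. How quickly it decays for intermediate k, and how that relates to the algebraic connectivity, is the kind of question the paper's bound addresses; this sketch does not reproduce that bound.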