Fugu-MT paper translation (abstract): Layered State Discovery for Incremental Autonomous Exploration

Paper abstract: Layered State Discovery for Incremental Autonomous Exploration

arxiv url: http://arxiv.org/abs/2302.03789v1
Date: Tue, 7 Feb 2023 22:58:12 GMT
Status: translation complete
Last updated in system: 2023-02-09 17:49:49.211003
Title: Layered State Discovery for Incremental Autonomous Exploration
Title (reference translation): Layered State Discovery for Incremental Autonomous Exploration
Authors: Liyu Chen, Andrea Tirinzoni, Alessandro Lazaric, Matteo Pirotta
Abstract summary: Layered Autonomous Exploration (LAE) is a novel algorithm for AX that attains a sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L(1+\epsilon)}\Gamma_{L(1+\epsilon)} A \ln^{12}(S^{\rightarrow}_{L(1+\epsilon)})/\epsilon^2)$.
Reference score (independently computed attention score): 106.37656068276901
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the autonomous exploration (AX) problem proposed by Lim & Auer (2012). In this setting, the objective is to discover a set of $\epsilon$-optimal policies reaching a set $\mathcal{S}_L^{\rightarrow}$ of incrementally $L$-controllable states. We introduce a novel layered decomposition of the set of incrementally $L$-controllable states that is based on the iterative application of a state-expansion operator. We leverage these results to design Layered Autonomous Exploration (LAE), a novel algorithm for AX that attains a sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L(1+\epsilon)}\Gamma_{L(1+\epsilon)} A \ln^{12}(S^{\rightarrow}_{L(1+\epsilon)})/\epsilon^2)$, where $S^{\rightarrow}_{L(1+\epsilon)}$ is the number of states that are incrementally $L(1+\epsilon)$-controllable, $A$ is the number of actions, and $\Gamma_{L(1+\epsilon)}$ is the branching factor of the transitions over such states. LAE improves over the algorithm of Tarbouriech et al. (2020a) by a factor of $L^2$ and it is the first algorithm for AX that works in a countably-infinite state space. Moreover, we show that, under a certain identifiability assumption, LAE achieves minimax-optimal sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L}A\ln^{12}(S^{\rightarrow}_{L})/\epsilon^2)$, outperforming existing algorithms and matching for the first time the lower bound proved by Cai et al. (2022) up to logarithmic factors.
Abstract (reference translation): We study the autonomous exploration (AX) problem proposed by Lim & Auer (2012). In this setting, the goal is to find a set of $\epsilon$-optimal policies that reach the set $\mathcal{S}_L^{\rightarrow}$ of incrementally $L$-controllable states. We introduce a layered decomposition of the set of incrementally $L$-controllable states, based on the iterative application of a state-expansion operator. We leverage these results to design Layered Autonomous Exploration (LAE), a novel algorithm for AX that attains a sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L(1+\epsilon)}\Gamma_{L(1+\epsilon)} A \ln^{12}(S^{\rightarrow}_{L(1+\epsilon)})/\epsilon^2)$, where $S^{\rightarrow}_{L(1+\epsilon)}$ is the number of states that are incrementally $L(1+\epsilon)$-controllable, $A$ is the number of actions, and $\Gamma_{L(1+\epsilon)}$ is the branching factor of the transitions over such states. LAE improves over the algorithm of Tarbouriech et al. (2020a) by a factor of $L^2$ and is the first algorithm for AX that works in a countably infinite state space. Furthermore, we show that, under a certain identifiability assumption, LAE achieves a minimax-optimal sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L}A\ln^{12}(S^{\rightarrow}_{L})/\epsilon^2)$, outperforming existing algorithms and matching, for the first time, the lower bound proved by Cai et al. (2022) up to logarithmic factors.

Related papers