Fugu-MT paper translation (abstract): Adversarial Imitation Learning with General Function Approximation: Theoretical Analysis and Practical Algorithms

Paper overview: Adversarial Imitation Learning with General Function Approximation: Theoretical Analysis and Practical Algorithms

arxiv url: http://arxiv.org/abs/2605.01778v1
Date: Sun, 03 May 2026 08:31:24 GMT
Status: translation complete
Last updated in system: 2026-05-05 20:33:49.935345
Title: Adversarial Imitation Learning with General Function Approximation: Theoretical Analysis and Practical Algorithms
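Title (reference translation): Adversarial Imitation Learning with General Function Approximation: Theoretical Analysis and Practical Algorithms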
Authors: Tian Xu, Zhilong Zhang, Zexuan Chen, Ruishuo Chen, Yihao Sun, Yang Yu
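Abstract summary: We introduce a novel framework called optimization-based AIL (OPT-AIL). OPT-AIL performs online optimization for reward learning and optimism-regularized optimization for policy learning. To the best of our knowledge, the OPT-AIL methods are the first provably efficient AIL methods under general function approximation.
Reference score (independently computed attention measure): 20.16205018738796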
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Adversarial imitation learning (AIL), a prominent approach in imitation learning, has achieved significant practical success powered by neural network approximation. However, existing theoretical analyses of AIL are primarily confined to simplified settings, such as tabular and linear function approximation, and involve complex algorithmic designs that impede practical implementation. This creates a substantial gap between theory and practice. This paper bridges this gap by exploring the theoretical underpinnings of online AIL with general function approximation. We introduce a novel framework called optimization-based AIL (OPT-AIL), which performs online optimization for reward learning coupled with optimism-regularized optimization for policy learning. Within this framework, we develop two concrete methods: model-free OPT-AIL and model-based OPT-AIL. Our theoretical analysis demonstrates that both variants achieve polynomial expert sample complexity and interaction complexity for learning near-expert policies. To the best of our knowledge, they represent the first provably efficient AIL methods under general function approximation. From a practical standpoint, OPT-AIL requires only the approximate optimization of two objectives, thereby facilitating practical implementation. Empirical studies demonstrate that OPT-AIL outperforms previous state-of-the-art deep AIL methods across several challenging tasks.
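Abstract (reference translation): Adversarial imitation learning (AIL), a prominent approach in imitation learning, has achieved significant practical success through neural network approximation. However, existing theoretical analyses of AIL are mostly limited to simplified settings such as tabular and linear function approximation, and they involve complex algorithmic designs that hinder practical implementation. This creates a large gap between theory and practice. This paper bridges the gap by exploring the theoretical foundations of online AIL with general function approximation. We introduce a new framework called optimization-based AIL (OPT-AIL), which combines online optimization for reward learning with optimism-regularized optimization for policy learning. Within this framework, we develop two concrete methods: model-free OPT-AIL and model-based OPT-AIL. Theoretical analysis shows that both variants achieve polynomial expert sample complexity and interaction complexity for learning near-expert policies. To the best of our knowledge, they are the first provably efficient AIL methods under general function approximation. From a practical standpoint, OPT-AIL requires only the approximate optimization of two objectives, which makes practical implementation easier. Empirical studies show that OPT-AIL outperforms previous state-of-the-art deep AIL methods on several challenging tasks.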
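
The abstract describes AIL as two coupled optimizations, one for the reward and one for the policy. The sketch below illustrates only that generic alternating structure on a toy one-step problem with three actions, and it is not OPT-AIL: it has no optimism regularization, no function approximation, and no environment interaction. The function name `run_toy_ail`, the step size, the reward range [-1, 1], and the choice of projected gradient ascent for the reward and exponentiated-gradient updates for the policy are all assumptions made for illustration.

```python
import math


def run_toy_ail(expert_probs, rounds=5000, step=0.03):
    """Alternate reward and policy updates in a one-step imitation game.

    Hypothetical toy example, not the paper's algorithm.
    Returns the policy averaged over all rounds.
    """
    k = len(expert_probs)
    reward = [0.0] * k          # reward estimate, kept inside [-1, 1]
    policy = [1.0 / k] * k      # start from the uniform policy
    avg_policy = [0.0] * k

    for _ in range(rounds):
        # Accumulate the running average of the policies played so far.
        for a in range(k):
            avg_policy[a] += policy[a] / rounds

        # Policy player: exponentiated-gradient step toward actions
        # that the current reward estimate scores highly.
        weights = [policy[a] * math.exp(step * reward[a]) for a in range(k)]
        total = sum(weights)
        new_policy = [w / total for w in weights]

        # Reward player: projected online gradient ascent on
        # E_expert[r] - E_policy[r]; its gradient is expert - policy.
        reward = [
            max(-1.0, min(1.0, reward[a] + step * (expert_probs[a] - policy[a])))
            for a in range(k)
        ]
        policy = new_policy

    return avg_policy


if __name__ == "__main__":
    print(run_toy_ail([0.7, 0.2, 0.1]))
```

Under these assumptions both players run no-regret updates, so the averaged policy should end up close to the expert's action distribution, while individual iterates may still oscillate. That is why the sketch returns the average and not the last policy.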
