Fugu-MT Paper Translation (Abstract): A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic

Paper overview: A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic

arxiv url: http://arxiv.org/abs/2007.05170v4
Date: Wed, 8 Jun 2022 05:49:52 GMT
Status: Translation complete
Last updated in system: 2022-11-11 22:45:10.009769
Title: A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
Title (reference translation): A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
Authors: Mingyi Hong, Hoi-To Wai, Zhaoran Wang, and Zhuoran Yang
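Abstract summary: Bilevel optimization is a class of problems that exhibit a two-level structure. We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such bilevel problems. We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of the TTSA framework.
Reference score (independently computed attention score): 142.1492359556374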
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization. Bilevel optimization is a class of problems which exhibit a two-level structure, and its goal is to minimize an outer objective function with variables which are constrained to be the optimal solution to an (inner) optimization problem. We consider the case when the inner problem is unconstrained and strongly convex, while the outer problem is constrained and has a smooth objective function. We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such a bilevel problem. In the algorithm, a stochastic gradient update with a larger step size is used for the inner problem, while a projected stochastic gradient update with a smaller step size is used for the outer problem. We analyze the convergence rates for the TTSA algorithm under various settings: when the outer problem is strongly convex (resp. weakly convex), the TTSA algorithm finds an $\mathcal{O}(K^{-2/3})$-optimal (resp. $\mathcal{O}(K^{-2/5})$-stationary) solution, where $K$ is the total iteration number. As an application, we show that a two-timescale natural actor-critic proximal policy optimization algorithm can be viewed as a special case of our TTSA framework. Importantly, the natural actor-critic algorithm is shown to converge at a rate of $\mathcal{O}(K^{-1/4})$ in terms of the gap in expected discounted reward compared to a global optimal policy.
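Abstract (reference translation): This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization. Bilevel optimization is a class of problems with a two-level structure; the goal is to minimize an outer objective function whose variables are constrained to be the optimal solution of an (inner) optimization problem. We consider the case where the inner problem is unconstrained and strongly convex, while the outer problem is constrained and has a smooth objective function. To tackle such bilevel problems, we propose a two-timescale stochastic approximation (TTSA) algorithm. The algorithm uses a stochastic gradient update with a larger step size for the inner problem and a projected stochastic gradient update with a smaller step size for the outer problem. When the outer problem is strongly convex (resp. weakly convex), the TTSA algorithm finds an $\mathcal{O}(K^{-2/3})$-optimal (resp. $\mathcal{O}(K^{-2/5})$-stationary) solution, where $K$ is the total number of iterations. As an application, we show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of the TTSA framework. Importantly, the natural actor-critic algorithm is shown to converge at a rate of $\mathcal{O}(K^{-1/4})$ in terms of the gap in expected discounted reward relative to a globally optimal policy.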
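
Illustrative sketch (not from the paper): the two-timescale update described in the abstract can be tried on a toy scalar problem. Everything specific here is an assumption made for illustration: the inner objective $g(x, y) = \frac{1}{2}(y - x)^2$ (so the inner solution is $y^*(x) = x$), the outer objective $f(x, y) = \frac{1}{2}x^2 + \frac{1}{2}(y - 1)^2$, the interval constraint on $x$, the Gaussian gradient noise, the function names `project` and `ttsa_toy`, and the step-size schedules. The only properties taken from the abstract are that the inner variable gets a plain stochastic gradient step with the larger step size and the outer variable gets a projected stochastic gradient step with the smaller one.

```python
import random


def project(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def ttsa_toy(num_iters=20000, lo=-2.0, hi=2.0, noise_std=0.1, seed=0):
    """Two-timescale stochastic approximation on a toy bilevel problem.

    Inner problem (assumed): g(x, y) = 0.5 * (y - x)**2, so y*(x) = x.
    Outer problem (assumed): f(x, y) = 0.5 * x**2 + 0.5 * (y - 1)**2,
    minimized over x in [lo, hi] with y constrained to equal y*(x).
    """
    rng = random.Random(seed)
    x, y = project(-1.5, lo, hi), 0.0
    for k in range(num_iters):
        # Step sizes are illustrative choices: the outer step decays
        # faster, so the inner variable moves on the faster timescale.
        alpha = 1.0 / (k + 1)             # outer (smaller) step size
        beta = 1.0 / (k + 1) ** (2 / 3)   # inner (larger) step size

        # Inner update: unconstrained stochastic gradient step in y.
        grad_y = (y - x) + rng.gauss(0.0, noise_std)
        y = y - beta * grad_y

        # Outer update: projected stochastic gradient step in x.
        # For this toy problem d/dx f(x, y*(x)) = x + (y*(x) - 1);
        # the current y stands in for the unknown y*(x).
        grad_x = x + (y - 1.0) + rng.gauss(0.0, noise_std)
        x = project(x - alpha * grad_x, lo, hi)
    return x, y


if __name__ == "__main__":
    print(ttsa_toy())                 # unconstrained minimizer is x = 0.5
    print(ttsa_toy(lo=0.0, hi=0.3))   # constraint active, minimizer x = 0.3
```

With the default interval the outer objective $\frac{1}{2}x^2 + \frac{1}{2}(x - 1)^2$ is minimized at $x = 0.5$, so both iterates should settle near 0.5. With the interval $[0, 0.3]$ the projection keeps $x$ at the boundary. The sketch makes no attempt to reproduce the convergence rates stated in the abstract.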

Related papers