Fugu-MT Paper Translation (Summary): Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization

Paper summary: Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization

arxiv url: http://arxiv.org/abs/2112.14368v1
Date: Wed, 29 Dec 2021 02:42:59 GMT
Status: Translation complete
Last updated in system: 2021-12-30 15:44:35.091843
Title: Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization
Title (reference translation): Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret in Online Convex Optimization
Authors: Peng Zhao, Yu-Jie Zhang, Lijun Zhang, Zhi-Hua Zhou
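Abstract summary: We study online convex optimization in non-stationary environments and choose dynamic regret as the performance measure. We propose novel online algorithms that exploit smoothness and replace the dependence on $T$ in the dynamic regret with problem-dependent quantities.
Reference score (independently computed attention score): 93.14387921542709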
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We investigate online convex optimization in non-stationary environments and choose the \emph{dynamic regret} as the performance measure, defined as the difference between the cumulative loss incurred by the online algorithm and that of any feasible comparator sequence. Let $T$ be the time horizon and $P_T$ be the path-length that essentially reflects the non-stationarity of environments; the state-of-the-art dynamic regret is $\mathcal{O}(\sqrt{T(1+P_T)})$. Although this bound is proved to be minimax optimal for convex functions, in this paper, we demonstrate that it is possible to further enhance the guarantee for some easy problem instances, particularly when online functions are smooth. Specifically, we propose novel online algorithms that can leverage smoothness and replace the dependence on $T$ in the dynamic regret by \emph{problem-dependent} quantities: the variation in gradients of loss functions, the cumulative loss of the comparator sequence, and the minimum of the previous two terms. These quantities are at most $\mathcal{O}(T)$ but could be much smaller in benign environments. Therefore, our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and meanwhile guarantee the same rate in the worst case. Notably, our algorithm requires only \emph{one} gradient per iteration, which matches the gradient query complexity of the methods developed for optimizing the static regret. As a further application, we extend the results from the full-information setting to bandit convex optimization with two-point feedback and thereby attain the first problem-dependent dynamic regret for such bandit tasks.
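The abstract describes its quantities in words only. As a reading aid, the block below writes them out in one common formalization. The symbols $x_t$ (the learner's decision), $u_t$ (the comparator), $\mathcal{X}$ (the feasible set), $V_T$ and $F_T$ are notation assumed here, not taken from the abstract, and the exact form of the gradient-variation term (for example, whether the norm is squared) should be checked against the paper itself.

```latex
% Dynamic regret against a comparator sequence u_1, ..., u_T in \mathcal{X}
\[
  \text{D-Regret}_T(u_1,\dots,u_T)
    = \sum_{t=1}^{T} f_t(x_t) - \sum_{t=1}^{T} f_t(u_t)
\]
% Path-length of the comparator sequence (measures non-stationarity)
\[
  P_T = \sum_{t=2}^{T} \lVert u_t - u_{t-1} \rVert_2
\]
% Assumed forms of the two problem-dependent quantities named in the abstract
\[
  V_T = \sum_{t=2}^{T} \sup_{x \in \mathcal{X}}
        \lVert \nabla f_t(x) - \nabla f_{t-1}(x) \rVert_2^2,
  \qquad
  F_T = \sum_{t=1}^{T} f_t(u_t)
\]
```

With a fixed comparator ($u_1 = \dots = u_T$) the path-length is $P_T = 0$ and the dynamic regret reduces to the usual static regret.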
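Abstract (reference translation): We study online convex optimization in non-stationary environments and choose dynamic regret as the performance measure, defined as the difference between the cumulative loss incurred by the online algorithm and that of any feasible comparator sequence. Let $T$ be the time horizon and $P_T$ the path-length that essentially reflects the non-stationarity of the environment; the state-of-the-art dynamic regret is $\mathcal{O}(\sqrt{T(1+P_T)})$. Although this bound has been proved minimax optimal for convex functions, this paper demonstrates that the guarantee can be further strengthened for easy problem instances, particularly when the online functions are smooth. Specifically, we propose online algorithms that exploit smoothness and replace the dependence on $T$ in the dynamic regret with problem-dependent quantities: the variation in gradients of the loss functions, the cumulative loss of the comparator sequence, and the minimum of these two terms. These quantities are at most $\mathcal{O}(T)$ but can be much smaller in benign environments. The results therefore adapt to the intrinsic difficulty of the problem: the bounds are tighter than existing results on easy problems while guaranteeing the same rate in the worst case. The algorithm shares the same gradient query complexity as methods developed for optimizing static regret. As a further application, we extend the results from the full-information setting to bandit convex optimization with two-point feedback and obtain the first problem-dependent dynamic regret for such bandit tasks.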
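To make the performance measure concrete, here is a minimal sketch that computes the dynamic regret and the path-length for a toy problem. It is not the paper's algorithm: it runs plain projected online gradient descent as a stand-in, and the one-dimensional quadratic losses, the drifting centers, the step size, and every function name are assumptions chosen for illustration.

```python
import math


def loss(x, c):
    """Toy smooth loss f_t(x) = 0.5 * (x - c)^2, minimized at x = c."""
    return 0.5 * (x - c) ** 2


def grad(x, c):
    """Gradient of the toy loss with respect to x."""
    return x - c


def project(x, radius):
    """Project x onto the feasible interval [-radius, radius]."""
    return max(-radius, min(radius, x))


def run_ogd(centers, step_size, radius=1.0, x0=0.0):
    """Projected online gradient descent, one gradient query per round.

    Returns the list of points played, one per round.
    """
    x = x0
    played = []
    for c in centers:
        played.append(x)  # commit to x_t before f_t is revealed
        x = project(x - step_size * grad(x, c), radius)
    return played


def dynamic_regret(played, comparators, centers):
    """Cumulative loss of the learner minus that of the comparator sequence."""
    learner = sum(loss(x, c) for x, c in zip(played, centers))
    reference = sum(loss(u, c) for u, c in zip(comparators, centers))
    return learner - reference


def path_length(comparators):
    """P_T: total movement of the comparator sequence."""
    return sum(abs(b - a) for a, b in zip(comparators, comparators[1:]))


if __name__ == "__main__":
    T = 200
    # Slowly drifting minimizers model a non-stationary environment.
    centers = [0.5 * math.sin(2 * math.pi * t / T) for t in range(T)]
    comparators = centers  # compare against the per-round minimizers
    played = run_ogd(centers, step_size=0.5)
    print("dynamic regret:", dynamic_regret(played, comparators, centers))
    print("path-length P_T:", path_length(comparators))
```

Because the comparators here are the per-round minimizers, their cumulative loss is zero, which is the kind of benign case in which a bound that scales with the comparator's cumulative loss could be much smaller than one that scales with $T$.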

Related papers