Fugu-MT 論文翻訳(概要): Revisiting Projection-Free Online Learning with Time-Varying Constraints

論文の概要: Revisiting Projection-Free Online Learning with Time-Varying Constraints

arxiv url: http://arxiv.org/abs/2501.16046v1
Date: Mon, 27 Jan 2025 13:38:51 GMT
ステータス: 翻訳完了
システム内更新日: 2025-01-28 21:57:03.940819
Title: Revisiting Projection-Free Online Learning with Time-Varying Constraints
Title（参考訳）: 時間変化制約によるプロジェクションフリーオンライン学習の再検討
Authors: Yibo Wang, Yuanyu Wan, Lijun Zhang,
Abstract要約: 我々は制約付きオンライン凸最適化について検討し、そこでは決定は固定的で典型的には複雑な領域に属する必要がある。いくつかのプロジェクションフリーな手法が$mathcalO(T3/4 sqrtlog T)$ regret boundと$mathcalO(T3/4 sqrtlog T)$ cumulative constraint violation (CCV) bound for general convex lossで提案されている。本稿では,損失関数が強凸である場合に,この結果を改善し,さらにテクスチノーベルの後悔とCCV境界を確立する。
参考スコア（独自算出の注目度）: 35.573654458435854
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We investigate constrained online convex optimization, in which decisions must belong to a fixed and typically complicated domain, and are required to approximately satisfy additional time-varying constraints over the long term. In this setting, the commonly used projection operations are often computationally expensive or even intractable. To avoid the time-consuming operation, several projection-free methods have been proposed with an $\mathcal{O}(T^{3/4} \sqrt{\log T})$ regret bound and an $\mathcal{O}(T^{7/8})$ cumulative constraint violation (CCV) bound for general convex losses. In this paper, we improve this result and further establish \textit{novel} regret and CCV bounds when loss functions are strongly convex. The primary idea is to first construct a composite surrogate loss, involving the original loss and constraint functions, by utilizing the Lyapunov-based technique. Then, we propose a parameter-free variant of the classical projection-free method, namely online Frank-Wolfe (OFW), and run this new extension over the online-generated surrogate loss. Theoretically, for general convex losses, we achieve an $\mathcal{O}(T^{3/4})$ regret bound and an $\mathcal{O}(T^{3/4} \log T)$ CCV bound, both of which are order-wise tighter than existing results. For strongly convex losses, we establish new guarantees of an $\mathcal{O}(T^{2/3})$ regret bound and an $\mathcal{O}(T^{5/6})$ CCV bound. Moreover, we also extend our methods to a more challenging setting with bandit feedback, obtaining similar theoretical findings. Empirically, experiments on real-world datasets have demonstrated the effectiveness of our methods.
Abstract（参考訳）: 制約付きオンライン凸最適化について検討し、決定は固定的で典型的には複雑な領域に属さなければならず、長期にわたる時間的制約をほぼ満たさなければならない。この設定では、よく使われる射影演算はしばしば計算コストがかかるか、あるいは難解である。時間のかかる操作を避けるために、$\mathcal{O}(T^{3/4} \sqrt{\log T})$ regret bound と $\mathcal{O}(T^{7/8})$ cumulative constraint violation (CCV) bound for general convex loss が提案されている。本稿では、この結果を改善し、損失関数が強く凸である場合に、さらに「textit{novel} regret」と「CCV境界」を確立する。第一の考え方は、リアプノフに基づく手法を用いて、元の損失と制約関数を含む複合代理損失を最初に構築することである。そこで本研究では,従来のプロジェクションフリー手法,すなわちオンラインのFrank-Wolfe (OFW) のパラメータフリー変種を提案し,オンラインのサロゲート損失に対して新たな拡張を実行する。理論的には、一般的な凸損失に対して、$\mathcal{O}(T^{3/4})$ regret bound と $\mathcal{O}(T^{3/4} \log T)$ CCV bound を得る。強い凸損失に対しては、$\mathcal{O}(T^{2/3})$ regret bound と $\mathcal{O}(T^{5/6})$ CCV bound の新たな保証を確立する。さらに,本手法をバンドフィードバックによるより困難な設定にまで拡張し,同様の理論的知見を得た。実世界のデータセットを用いた実験により,本手法の有効性が実証された。

関連論文リスト

Order-Optimal Projection-Free Algorithm for Adversarially Constrained Online Convex Optimization [29.705337940879705]
制約付きオンライン凸最適化(COCO)のための投影型アルゴリズムは、高次元設定においてスケーラビリティの課題に直面している。本稿では,プロジェクションの必要性を排除しつつ,最先端の性能保証を実現するCOCOのプロジェクションフリーアルゴリズムを提案する。
論文参考訳（メタデータ） (2025-02-23T23:18:40Z)
An Optimistic Algorithm for Online Convex Optimization with Adversarial Constraints [55.2480439325792]
逆制約を伴うオンライン凸最適化(OCO)について検討する。本稿では,損失関数と制約関数の予測にアルゴリズムがアクセス可能な設定に着目する。以上の結果から,現在のO(sqrtT) $ regret と $ tildeO(sqrtT) $ cumulative constraint violation の改善が期待できることがわかった。
論文参考訳（メタデータ） (2024-12-11T03:06:42Z)
Optimistic Safety for Online Convex Optimization with Unknown Linear Constraints [31.526232903811533]
我々はOCO(Optimistically Safe OCO)と呼ぶアルゴリズムを導入し、そのアルゴリズムが$tildeO(sqrtT)$ regretと制約違反がないことを示す。静的線形制約の場合、これは同じ仮定の下で、以前の最もよく知られた $tildeO(T2/3)$ regret よりも改善される。時間的制約の場合、当社の作業は、$O(sqrtT)$ regretと$O(sqrtT)$ cumulative violationを示す既存の結果を補完します。
論文参考訳（メタデータ） (2024-03-09T04:01:39Z)
Multi-point Feedback of Bandit Convex Optimization with Hard Constraints [1.8130068086063336]
本研究では,学習者が損失関数の部分的情報に基づいて決定列を生成することを目的とした制約付き帯域凸最適化について検討する。我々は、累積的テクスト制約違反を制約違反の指標として採用する。我々のアルゴリズムは、凸損失関数と時間変化制約に対して、$O(d2Tmaxc,1-c)$ regret bounds と $O(d2T1-fracc2)$ cumulative hard constraint violation bounds を得る。
論文参考訳（メタデータ） (2023-10-17T02:43:22Z)
Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
本稿では,2段階の適応性を持つオンライン凸最適化手法を提案する。我々は$mathcalO(log V_T)$, $mathcalO(d log V_T)$, $hatmathcalO(sqrtV_T)$ regret bounds for strong convex, exp-concave and convex loss function。
論文参考訳（メタデータ） (2023-07-17T09:55:35Z)
Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
本稿では,スムーズさを生かし,問題依存量による動的後悔のT$への依存を補う新しいオンラインアルゴリズムを提案する。この結果が本質的な難易度に適応しているのは, 既往の結果よりも厳密であり, 最悪の場合, 同一レートの保護が可能であるからである。
論文参考訳（メタデータ） (2021-12-29T02:42:59Z)
Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints [24.97580261894342]
本稿では,長期的制約を伴うオンライン凸最適化について考察する。新たなアルゴリズムが最初に提案され、静的後悔のために$mathcalO(Tmaxc,1-c)$bound、累積制約違反のために$mathcalO(T(1-c)/2)$boundを達成する。
論文参考訳（メタデータ） (2021-06-09T15:18:06Z)
Projection-free Online Learning over Strongly Convex Sets [24.517908972536432]
我々は,強い凸集合に対するオンライン学習の特別な事例について検討し,OWが一般凸集合の損失に対して$O(T2/3)$よりも良い後悔の限界を享受できることを最初に証明した。一般凸集合に対する$O(sqrtT)$の後悔境界と強凸集合に対する$O(sqrtT)$の後悔境界を達成することを示す。
論文参考訳（メタデータ） (2020-10-16T05:42:50Z)
Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation [107.06364966905821]
線形2次レギュレータ(LQR)設定における探索・探索ジレンマについて検討した。有限 MDP に対する楽観的アルゴリズムで用いられる拡張値反復アルゴリズムに着想を得て,Oulq の楽観的最適化を緩和することを提案する。我々は、少なくとも$Obig(log (1/epsilon)big)$ Riccati方程式を解くことで、$epsilon$-OptimisticControllerを効率的に計算できることを示した。
論文参考訳（メタデータ） (2020-07-13T16:30:47Z)
Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
非定常環境におけるオンライン凸最適化について検討する。パフォーマンス指標として動的後悔を選択します。本研究では, 滑らかさを活かして, 動的後悔をさらに高めることが可能であることを示す。
論文参考訳（メタデータ） (2020-07-07T14:10:57Z)
Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss [145.54544979467872]
マルコフ決定過程(CMDP)に対するオンライン学習の検討本稿では,遷移モデルから標本化した軌跡のみを必要とする,新しいEmphupper confidence primal-dualアルゴリズムを提案する。我々の分析では、ラグランジュ乗算過程の新たな高確率ドリフト解析を、高信頼強化学習の記念後悔解析に組み入れている。
論文参考訳（メタデータ） (2020-03-02T05:02:23Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。