Fugu-MT Paper Translation (Abstract): Online Learning for Approximately-Convex Functions with Long-term Adversarial Constraints

Paper abstract: Online Learning for Approximately-Convex Functions with Long-term Adversarial Constraints

arxiv url: http://arxiv.org/abs/2508.16992v1
Date: Sat, 23 Aug 2025 11:21:24 GMT
Status: Translation complete
Last updated in system: 2025-08-26 18:43:45.289037
Title: Online Learning for Approximately-Convex Functions with Long-term Adversarial Constraints
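Title (reference translation): Online Learning of Approximately-Convex Functions with Long-term Constraints
Authors: Dhruv Sarkar, Samrat Mukhopadhyay, Abhishek Sinha
Abstract summary: We study an online learning problem with long-term budget constraints in the adversarial setting. At each round $t$, the learner selects an action from a convex set, after which the adversary reveals the cost function. The proposed approach yields an efficient solution with improved guarantees for the $\texttt{Adversarial Bandits with Knapsacks}$ problem.
Reference score (independently computed attention measure): 8.314956969778692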
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We study an online learning problem with long-term budget constraints in the adversarial setting. In this problem, at each round $t$, the learner selects an action from a convex decision set, after which the adversary reveals a cost function $f_t$ and a resource consumption function $g_t$. The cost and consumption functions are assumed to be $\alpha$-approximately convex - a broad class that generalizes convexity and encompasses many common non-convex optimization problems, including DR-submodular maximization, Online Vertex Cover, and Regularized Phase Retrieval. The goal is to design an online algorithm that minimizes cumulative cost over a horizon of length $T$ while approximately satisfying a long-term budget constraint of $B_T$. We propose an efficient first-order online algorithm that guarantees $O(\sqrt{T})$ $\alpha$-regret against the optimal fixed feasible benchmark while consuming at most $O(B_T \log T)+ \tilde{O}(\sqrt{T})$ resources in both full-information and bandit feedback settings. In the bandit feedback setting, our approach yields an efficient solution for the $\texttt{Adversarial Bandits with Knapsacks}$ problem with improved guarantees. We also prove matching lower bounds, demonstrating the tightness of our results. Finally, we characterize the class of $\alpha$-approximately convex functions and show that our results apply to a broad family of problems.
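Abstract (reference translation): We study an online learning problem with long-term budget constraints in the adversarial setting. In this problem, at each round $t$, the learner selects an action from a convex decision set, after which the adversary reveals a cost function $f_t$ and a resource consumption function $g_t$. The cost and consumption functions are assumed to be $\alpha$-approximately convex, a broad class that generalizes convexity and encompasses many common non-convex optimization problems, including DR-submodular maximization, Online Vertex Cover, and Regularized Phase Retrieval. The goal is to design an online algorithm that minimizes cumulative cost over a horizon of length $T$ while approximately satisfying a long-term budget constraint of $B_T$. We propose an efficient first-order online algorithm that guarantees $O(\sqrt{T})$ $\alpha$-regret against the optimal fixed feasible benchmark while consuming at most $O(B_T \log T)+ \tilde{O}(\sqrt{T})$ resources in both the full-information and bandit feedback settings. In the bandit feedback setting, the approach yields an efficient solution for the $\texttt{Adversarial Bandits with Knapsacks}$ problem with improved guarantees. We also prove matching lower bounds, demonstrating the tightness of the results. Finally, we characterize the class of $\alpha$-approximately convex functions and show that the results apply to a broad family of problems.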

Related papers