Fugu-MT 論文翻訳(概要): Multi-Fidelity Multi-Armed Bandits Revisited

論文の概要: Multi-Fidelity Multi-Armed Bandits Revisited

arxiv url: http://arxiv.org/abs/2306.07761v1
Date: Tue, 13 Jun 2023 13:19:20 GMT
ステータス: 翻訳完了
システム内更新日: 2023-06-14 13:42:54.151834
Title: Multi-Fidelity Multi-Armed Bandits Revisited
Title（参考訳）: マルチフィデリティマルチアーマッドバンドの再検討
Authors: Xuchuang Wang, Qingyun Wu, Wei Chen, John C.S. Lui
Abstract要約: 我々は,MF-MAB(Multi-fidelity multi-armed bandit)問題の拡張であるMF-MAB(Multi-fidelity multi-armed bandit)について検討した。 MF-MABは、各アームを異なるコスト(忠実さ)と観察精度で引っ張ることができる。
参考スコア（独自算出の注目度）: 46.19926456682379
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the multi-fidelity multi-armed bandit (MF-MAB), an extension of the canonical multi-armed bandit (MAB) problem. MF-MAB allows each arm to be pulled with different costs (fidelities) and observation accuracy. We study both the best arm identification with fixed confidence (BAI) and the regret minimization objectives. For BAI, we present (a) a cost complexity lower bound, (b) an algorithmic framework with two alternative fidelity selection procedures, and (c) both procedures' cost complexity upper bounds. From both cost complexity bounds of MF-MAB, one can recover the standard sample complexity bounds of the classic (single-fidelity) MAB. For regret minimization of MF-MAB, we propose a new regret definition, prove its problem-independent regret lower bound $\Omega(K^{1/3}\Lambda^{2/3})$ and problem-dependent lower bound $\Omega(K\log \Lambda)$, where $K$ is the number of arms and $\Lambda$ is the decision budget in terms of cost, and devise an elimination-based algorithm whose worst-cost regret upper bound matches its corresponding lower bound up to some logarithmic terms and, whose problem-dependent bound matches its corresponding lower bound in terms of $\Lambda$.
Abstract（参考訳）: 我々は,MF-MAB(Multi-fidelity multi-armed bandit)問題の拡張について検討した。 MF-MABは、各アームを異なるコスト(忠実さ)と観察精度で引っ張ることができる。最善の腕識別を固定信頼度(bai)と後悔の最小化目標の両方で検討した。 BAIについて (a)コストの複雑さが低いこと b) 2つの代替忠実度選択手順をもつアルゴリズムの枠組み、及び (c)双方の手続の費用の複雑さが上限を超えること。 MF-MABの両コスト複雑性境界から、古典的(単一忠実性)MABの標準サンプル複雑性境界を復元することができる。 For regret minimization of MF-MAB, we propose a new regret definition, prove its problem-independent regret lower bound $\Omega(K^{1/3}\Lambda^{2/3})$ and problem-dependent lower bound $\Omega(K\log \Lambda)$, where $K$ is the number of arms and $\Lambda$ is the decision budget in terms of cost, and devise an elimination-based algorithm whose worst-cost regret upper bound matches its corresponding lower bound up to some logarithmic terms and, whose problem-dependent bound matches its corresponding lower bound in terms of $\Lambda$.

関連論文リスト

Influential Bandits: Pulling an Arm May Change the Environment [44.71145269686588]
現実世界のアプリケーションは、しばしば非定常環境と武器間の相互依存を含む。本稿では,未知の,対称な正の半定値相互作用行列による腕間相互作用をモデル化する,影響力のあるバンドイット問題を提案する。我々は,損失ダイナミクスの構造に合わせて,低信頼境界(LCB)推定器に基づく新しいアルゴリズムを提案する。
論文参考訳（メタデータ） (2025-04-11T02:05:51Z)
Best Arm Identification with Minimal Regret [55.831935724659175]
最高の腕識別問題優雅にアマルガメートは、最小化とBAIを後悔している。エージェントの目標は、所定の信頼度で最高の腕を特定することである。二重KL-UCBアルゴリズムは、信頼度がゼロになる傾向があるため、最適性を達成する。
論文参考訳（メタデータ） (2024-09-27T16:46:02Z)
Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget [6.22018632187078]
固定予算の制約付きベスト・ミックスアーム識別(CBMAI)問題を導入する。目標は、与えられた学習予算$N$で、期待されるコストの制約によって期待される報酬を最大化する最高の混合アームを見つけることである。我々は、(最良の混合アームの支持の)誤識別に関する理論上の上限を提供し、予算$N$で指数関数的に崩壊することを示す。
論文参考訳（メタデータ） (2024-05-23T22:35:11Z)
Fixed-Budget Differentially Private Best Arm Identification [62.36929749450298]
差分プライバシー制約下における固定予算制度における線形包帯のベストアーム識別(BAI)について検討した。誤差確率に基づいてミニマックス下限を導出し、下限と上限が指数関数的に$T$で崩壊することを示した。
論文参考訳（メタデータ） (2024-01-17T09:23:25Z)
Complexity Analysis of a Countable-armed Bandit Problem [9.163501953373068]
遊びの地平線上で期待される累積的後悔を最小限に抑えるという古典的問題を考察する。我々は、$K=2$のとき、$mathcalOleft(log n right)$の率最適有限時間インスタンス依存後悔を実現するアルゴリズムを提案する。問題に対する後悔の順序と複雑さは、古典的MAB問題と非常に類似していることを示しているが、アルゴリズム設計における性能境界の特性と健全な側面は、後者とはかなり異なる。
論文参考訳（メタデータ） (2023-01-18T00:53:46Z)
Dueling Bandits: From Two-dueling to Multi-dueling [40.4378338001229]
エージェントが複数の選択肢を同時に比較し、最適な腕を選択することで後悔を最小限に抑える、一般的なマルチダウリングバンディット問題について検討する。この設定は従来の二段バンディット問題を一般化し、複数の選択肢に対する主観的なフィードバックを含む現実世界の多くのアプリケーションを見つける。
論文参考訳（メタデータ） (2022-11-16T22:00:54Z)
Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms [59.8188496313214]
半帯域 (CMAB) について検討し, 半帯域 (CMAB) におけるバッチサイズ (K$) の依存性の低減に着目した。まず,確率的に引き起こされるアーム(CMAB-T)を用いたCMABの設定に対して,分散を考慮した信頼区間を持つBCUCB-Tアルゴリズムを提案する。次に,独立アームを用いた非トリガ型CMABの設定に対して,TPVM条件の非トリガ型を利用したSESCBアルゴリズムを提案する。
論文参考訳（メタデータ） (2022-08-31T13:09:39Z)
Achieving the Pareto Frontier of Regret Minimization and Best Arm Identification in Multi-Armed Bandits [91.8283876874947]
本稿では,BoBW-lil'UCB$(gamma)$アルゴリズムの設計と解析を行う。 i) RMとBAIの両方の目的に対して最適なアルゴリズムを同時に実行できないことを示す。また、BoBW-lil'UCB$(gamma)$は、時間複雑性と後悔の点で競合よりも優れていることを示す。
論文参考訳（メタデータ） (2021-10-16T17:52:32Z)
Problem Dependent View on Structured Thresholding Bandit Problems [73.70176003598449]
我々は、Thresholding Bandit problem (TBP)における問題依存体制について検討する。学習者の目的は、シーケンシャルゲームの終わりに、所定のしきい値を超える手段を持つアームセットを出力することである。コンケーブ設定と単調設定の両方で誤差の確率を上下に設定する。
論文参考訳（メタデータ） (2021-06-18T15:01:01Z)
Recurrent Submodular Welfare and Matroid Blocking Bandits [22.65352007353614]
最近の研究は、マルチアームバンディット問題(MAB)の研究に焦点をあてている。我々は、任意のマトロイドに対して$ (1 - frac1e)$-approximation を得ることのできる新しいアルゴリズムのアイデアを開発した。鍵となる要素は、相関的な(インターリーブされた)スケジューリング技術である。
論文参考訳（メタデータ） (2021-01-30T21:51:47Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。