Fugu-MT Paper Translation (Abstract): One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise

Paper abstract: One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise

arxiv url: http://arxiv.org/abs/2602.14474v1
Date: Mon, 16 Feb 2026 05:25:06 GMT
Status: Translation complete
Last updated in system: 2026-02-17 16:22:50.160266
Title: One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise
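Title (reference translation): Near-Optimal Regret for Bandits under Heterogeneous Noise
Authors: Aadirupa Saha, Amith Bhat, Haipeng Luo
Abstract summary: Source-Optimistic Adaptive Regret minimization (SOAR) is a novel algorithm that prunes high-variance sources using sharp variance-concentration bounds. We show that it attains the optimal instance-dependent regret of standard single-source MAB with variance ${σ^*}^2$. Our theoretical bounds are a significant improvement over the proposed baselines.
Reference score (independently computed attention score): 49.12618706309658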
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the $K$-armed Multi-Armed Bandit (MAB) problem with $M$ heterogeneous data sources, each exhibiting an unknown and distinct noise variance, $\{σ_j^2\}_{j=1}^M$. The learner's objective is standard MAB regret minimization, with the additional complexity of adaptively selecting which data source to query at each round. We propose Source-Optimistic Adaptive Regret minimization (SOAR), a novel algorithm that quickly prunes high-variance sources using sharp variance-concentration bounds, followed by a `balanced min-max LCB-UCB approach' that seamlessly integrates the parallel tasks of identifying the best arm and the optimal (minimum-variance) data source. Our analysis shows SOAR achieves an instance-dependent regret bound of $\tilde{O}\left({σ^*}^2\sum_{i=2}^K \frac{\log T}{Δ_i} + \sqrt{K \sum_{j=1}^M σ_j^2}\right)$, up to preprocessing costs depending only on problem parameters, where ${σ^*}^2 := \min_j σ_j^2$ is the minimum source variance and $Δ_i$ denotes the suboptimality gap of the $i$-th arm. This result is surprising: despite lacking prior knowledge of the minimum-variance source among the $M$ alternatives, SOAR attains the optimal instance-dependent regret of standard single-source MAB with variance ${σ^*}^2$, while incurring only a small (and unavoidable) additive cost of $\tilde O(\sqrt{K \sum_{j=1}^M σ_j^2})$ towards identifying the optimal (minimum-variance) source. Our theoretical bounds represent a significant improvement over some proposed baselines, e.g. Uniform UCB or Explore-then-Commit UCB, which could potentially suffer regret scaling with $σ_{\max}^2$ in place of ${σ^*}^2$, a gap that can be arbitrarily large when $σ_{\max} \gg σ^*$. Experiments on multiple synthetic problem instances and the real-world MovieLens 25M dataset demonstrate the superior performance of SOAR over the baselines.
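Abstract (reference translation): We study the $K$-armed multi-armed bandit (MAB) problem with $M$ heterogeneous data sources, each exhibiting an unknown noise variance, $\{σ_j^2\}_{j=1}^M$. The learner's objective is standard MAB regret minimization, with the added complexity of adaptively selecting which data source to query at each round. We propose Source-Optimistic Adaptive Regret minimization (SOAR), a novel algorithm that quickly prunes high-variance sources using sharp variance-concentration bounds, followed by a `balanced min-max LCB-UCB approach' that seamlessly integrates the parallel tasks of identifying the best arm and the optimal (minimum-variance) data source. Our analysis shows that SOAR achieves an instance-dependent regret bound of $\tilde{O}\left({σ^*}^2\sum_{i=2}^K \frac{\log T}{Δ_i} + \sqrt{K \sum_{j=1}^M σ_j^2}\right)$. Despite lacking prior knowledge of the minimum-variance source among the $M$ alternatives, SOAR attains the optimal instance-dependent regret of standard single-source MAB with variance ${σ^*}^2$, while incurring only a small (and unavoidable) additive cost of $\tilde O(\sqrt{K \sum_{j=1}^M σ_j^2})$ for identifying the optimal (minimum-variance) source. Our theoretical bounds are a significant improvement over proposed baselines such as Uniform UCB and Explore-then-Commit UCB, whose regret could scale with $σ_{\max}^2$ in place of ${σ^*}^2$. Experiments on multiple synthetic problem instances and the real-world MovieLens 25M dataset show that SOAR outperforms the baselines.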
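
To make the stated bound concrete, the following is a minimal sketch that evaluates its two terms on a made-up instance and compares them with a leading term that scales with $σ_{\max}^2$, as the abstract says the baselines could. It is not the SOAR algorithm. The gaps, variances, and horizon are hypothetical values chosen for illustration, the function names are our own, and the constants and logarithmic factors hidden by $\tilde{O}$ are simply dropped, so only the relative scale is meaningful.

```python
import math


def leading_term(variance, gaps, horizon):
    """variance * sum_i log(T) / Delta_i over suboptimal arms (constants dropped)."""
    return variance * sum(math.log(horizon) / gap for gap in gaps)


def source_id_term(num_arms, variances):
    """sqrt(K * sum_j sigma_j^2): the additive source-identification cost."""
    return math.sqrt(num_arms * sum(variances))


# Hypothetical instance (assumed values, not taken from the paper).
gaps = [0.1, 0.2, 0.3]         # Delta_i for the K - 1 suboptimal arms, so K = 4
variances = [0.25, 1.0, 25.0]  # sigma_j^2 for M = 3 sources
horizon = 100_000              # T

num_arms = len(gaps) + 1

# Scale of the stated SOAR bound: minimum-variance leading term plus additive cost.
soar_scale = leading_term(min(variances), gaps, horizon) + source_id_term(num_arms, variances)

# Scale of a leading term driven by the worst source instead.
worst_scale = leading_term(max(variances), gaps, horizon)

print(f"SOAR-style scale:      {soar_scale:.1f}")
print(f"sigma_max-driven scale: {worst_scale:.1f}")
print(f"ratio:                 {worst_scale / soar_scale:.1f}")
```

Because the leading term is linear in the variance, the two leading terms differ by exactly the factor $σ_{\max}^2 / {σ^*}^2$, while the additive term does not grow with $T$.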

Related papers