Fugu-MT 論文翻訳(概要): Nonparametric Kernel Clustering with Bandit Feedback

論文の概要: Nonparametric Kernel Clustering with Bandit Feedback

arxiv url: http://arxiv.org/abs/2601.07535v1
Date: Mon, 12 Jan 2026 13:36:07 GMT
ステータス: 翻訳完了
システム内更新日: 2026-01-13 19:08:01.413515
Title: Nonparametric Kernel Clustering with Bandit Feedback
Title（参考訳）: 帯域フィードバックを用いた非パラメトリックカーネルクラスタリング
Authors: Victor Thuot, Sebastian Vogt, Debarghya Ghoshdastidar, Nicolas Verzelen,
Abstract要約: バンディットフィードバックによるクラスタリングは、クラスタリングアルゴリズムがアイテムをシーケンシャルにクエリしてノイズの多い観察を受信する、一連のアイテムを分割する問題を指す。我々はカーネルベースのアプローチを導入し、非パラメトリック問題をヒルベルトカーネル空間(RKHS)への平均埋め込みをカーネルに従ってクラスタ化するタスクとして再構築する。本稿では,アーム分布の最大平均誤差(MMD)とRKHSのばらつきに依存する問題に対して,信号対雑音比の概念を導入する。
参考スコア（独自算出の注目度）: 9.68728390492671
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Clustering with bandit feedback refers to the problem of partitioning a set of items, where the clustering algorithm can sequentially query the items to receive noisy observations. The problem is formally posed as the task of partitioning the arms of an N-armed stochastic bandit according to their underlying distributions, grouping two arms together if and only if they share the same distribution, using samples collected sequentially and adaptively. This setting has gained attention in recent years due to its applicability in recommendation systems and crowdsourcing. Existing works on clustering with bandit feedback rely on a strong assumption that the underlying distributions are sub-Gaussian. As a consequence, the existing methods mainly cover settings with linearly-separable clusters, which has little practical relevance. We introduce a framework of ``nonparametric clustering with bandit feedback'', where the underlying arm distributions are not constrained to any parametric, and hence, it is applicable for active clustering of real-world datasets. We adopt a kernel-based approach, which allows us to reformulate the nonparametric problem as the task of clustering the arms according to their kernel mean embeddings in a reproducing kernel Hilbert space (RKHS). Building on this formulation, we introduce the KABC algorithm with theoretical correctness guarantees and analyze its sampling budget. We introduce a notion of signal-to-noise ratio for this problem that depends on the maximum mean discrepancy (MMD) between the arm distributions and on their variance in the RKHS. Our algorithm is adaptive to this unknown quantity: it does not require it as an input yet achieves instance-dependent guarantees.
Abstract（参考訳）: バンディットフィードバックによるクラスタリングは、一連のアイテムを分割する問題を指し、クラスタリングアルゴリズムは、アイテムをシーケンシャルにクエリしてノイズの多い観察を受けることができる。この問題は、N腕の確率的バンディットの腕をその基礎的な分布に従って分割し、2つの腕を同じ分布を共有する場合に限って、順次、適応的に収集されたサンプルを用いてグループ化するタスクとして公式に提起される。この設定は近年、レコメンデーションシステムやクラウドソーシングの適用性から注目されている。既存のバンディットフィードバックによるクラスタリングの研究は、基礎となる分布がガウス群であるという強い仮定に依存している。その結果、既存の手法は主に線形分離可能なクラスタによる設定をカバーしており、実際的な関連性はほとんどない。本稿では,「バンドフィードバックを伴う非パラメトリッククラスタリング」というフレームワークを導入し,その基盤となるアーム分布は任意のパラメトリックに制約されず,実世界のデータセットのアクティブクラスタリングに適用可能であることを示す。我々は、カーネルベースのアプローチを採用し、非パラメトリック問題を、再現されたカーネルヒルベルト空間(RKHS)に、そのカーネル平均埋め込みに従ってアームをクラスタ化するタスクとして再構成することができる。この定式化に基づいて、理論的正当性を保証するKABCアルゴリズムを導入し、そのサンプリング予算を分析する。本稿では,アーム分布の最大平均誤差(MMD)とRKHSのばらつきに依存する問題に対して,信号対雑音比の概念を導入する。我々のアルゴリズムはこの未知の量に適応しており、入力として必要ありませんが、インスタンスに依存した保証が得られます。

論文の概要: Nonparametric Kernel Clustering with Bandit Feedback

関連論文リスト