Fugu-MT Paper Translation (Abstract): Partial Feedback Online Learning

Paper overview: Partial Feedback Online Learning

arxiv url: http://arxiv.org/abs/2601.21462v2
Date: Thu, 05 Feb 2026 02:57:08 GMT
Status: Translation complete
Last updated in system: 2026-02-06 14:11:23.802854
Title: Partial Feedback Online Learning
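Title (reference translation): Partial Feedback in Online Learning
Authors: Shihao Shao, Cong Fang, Zhouchen Lin, Dacheng Tao
Abstract summary: We study a new learning protocol called partial-feedback online learning. Each instance admits a set of acceptable labels, but the learner observes only one acceptable label per round.
Reference score (independently computed attention measure): 88.27143767009376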
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study a new learning protocol, termed partial-feedback online learning, where each instance admits a set of acceptable labels, but the learner observes only one acceptable label per round. We highlight that, while classical version space is widely used for online learnability, it does not directly extend to this setting. We address this obstacle by introducing a collection version space, which maintains sets of hypotheses rather than individual hypotheses. Using this tool, we obtain a tight characterization of learnability in the set-realizable regime. In particular, we define the Partial-Feedback Littlestone dimension (PFLdim) and the Partial-Feedback Measure Shattering dimension (PMSdim), and show that they tightly characterize the minimax regret for deterministic and randomized learners, respectively. We further identify a nested inclusion condition under which deterministic and randomized learnability coincide, resolving an open question of Raman et al. (2024b). Finally, given a hypothesis space H, we show that beyond set realizability, the minimax regret can be linear even when |H|=2, highlighting a barrier beyond set realizability.
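Abstract (reference translation): We study a new learning protocol called partial-feedback online learning, in which each instance admits a set of acceptable labels but the learner observes only one acceptable label per round. We emphasize that the classical version space, although widely used for online learnability, does not extend directly to this setting. We address this obstacle by introducing a collection version space, which maintains sets of hypotheses rather than individual hypotheses. With this tool, we obtain a tight characterization of learnability in the set-realizable regime. In particular, we define the Partial-Feedback Littlestone dimension (PFLdim) and the Partial-Feedback Measure Shattering dimension (PMSdim) and show that they tightly characterize the minimax regret for deterministic and randomized learners, respectively. We further identify a nested inclusion condition under which deterministic and randomized learnability coincide, resolving an open question of Raman et al. (2024b). Finally, given a hypothesis space H, we show that beyond set realizability the minimax regret can be linear even when |H|=2, highlighting a barrier beyond set realizability.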
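
The protocol described in the abstract can be sketched with a toy example. The sketch below is not the paper's construction: the instances, labels, hypotheses `h1` and `h2`, and the 0/1 loss on membership in the acceptable set are all assumptions made for illustration, and the collection version space itself is not implemented. It only shows one plausible reason why a classical version space, which discards every hypothesis that disagrees with the observed label, may not carry over: a hypothesis can differ from the revealed label and still predict an acceptable one.

```python
from typing import Callable, Dict, List, Set

Label = str
Hypothesis = Callable[[int], Label]

# Hypothetical toy problem: each instance admits a set of acceptable labels.
ACCEPTABLE: Dict[int, Set[Label]] = {0: {"a", "b"}, 1: {"b", "c"}}

# Two hypothetical hypotheses; both always predict an acceptable label.
HYPOTHESES: Dict[str, Hypothesis] = {
    "h1": lambda x: "a" if x == 0 else "c",
    "h2": lambda x: "b",
}


def set_loss(prediction: Label, acceptable: Set[Label]) -> int:
    """Assumed loss: 0 if the prediction is acceptable, 1 otherwise."""
    return 0 if prediction in acceptable else 1


def naive_update(space: Dict[str, Hypothesis], x: int, revealed: Label) -> Dict[str, Hypothesis]:
    """Classical rule: keep only hypotheses that output the revealed label on x."""
    return {name: h for name, h in space.items() if h(x) == revealed}


rounds: List[int] = [0, 1, 0, 1]
# Partial feedback: exactly one acceptable label is revealed per round.
revealed_labels: List[Label] = ["b", "b", "b", "b"]

version_space = dict(HYPOTHESES)
for x, y in zip(rounds, revealed_labels):
    assert y in ACCEPTABLE[x]  # the revealed label is always acceptable
    version_space = naive_update(version_space, x, y)

# Cumulative loss of each hypothesis over the same rounds.
losses = {
    name: sum(set_loss(h(x), ACCEPTABLE[x]) for x in rounds)
    for name, h in HYPOTHESES.items()
}
survivors = sorted(version_space)

print(losses)     # {'h1': 0, 'h2': 0}
print(survivors)  # ['h2']
```

In this toy run `h1` never incurs loss, yet the naive elimination rule removes it in the first round because the revealed label happens to be "b". Under these assumptions, disagreement with the single observed label is not evidence that a hypothesis is wrong.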

Related papers