Fugu-MT Paper Translation (Abstract): Easier to Judge than to Find: Predicting In-Context Learning Success for Demonstration Selection

Paper abstract: Easier to Judge than to Find: Predicting In-Context Learning Success for Demonstration Selection

arxiv url: http://arxiv.org/abs/2605.18512v1
Date: Mon, 18 May 2026 15:03:59 GMT
Status: Translation complete
System update date: 2026-05-19 17:57:49.893626
Title: Easier to Judge than to Find: Predicting In-Context Learning Success for Demonstration Selection
Title (reference translation): Easier to Judge than to Find: Predicting In-Context Learning Success for Demonstration Selection
Authors: Haochun Wang, Chaofen Yang, Jiatong Liu, Jingbo Wang, Zewen Qiang, Sendong Zhao, Bing Qin, Ting Liu
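Abstract summary: In-context learning (ICL) is highly sensitive to which demonstrations appear in the prompt. We propose DiSP, a sample-and-judge framework that stratifies queries by difficulty.
Reference score (independently computed attention score): 39.09905609945585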
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In-context learning (ICL) is highly sensitive to which demonstrations appear in the prompt, but selecting them is expensive because the space of possible demonstration contexts and combinations is enormous. We argue that demonstration selection is easier to judge than to find: predicting whether a specific query-context pair $(q,D)$ will succeed is cheaper and more general than searching for an optimal $D^\star$. Based on this insight, we propose DiSP, a sample-and-judge framework that stratifies queries by difficulty. DiSP runs random demonstration trials to estimate success rate of each training query, trains a lightweight router to predict difficulty from the query, and trains level-specific judges for sampled demonstrations. At inference, DiSP performs stop-on-acceptance judging under an explicit budget, emitting diagnostic risk tags when no suitable context is found. Across five classification datasets with Llama 3-8B and Qwen 2.5-7B, DiSP achieves the best average accuracy, improving over strong learned selection baselines by up to 3.4%, while achieving up to $23\times$ end-to-end wall-clock speedup.
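Abstract (reference translation): In-context learning (ICL) is highly sensitive to which demonstrations appear in the prompt, but selecting them is expensive because the space of demonstration contexts and combinations is enormous. Predicting whether a specific query-context pair $(q,D)$ will succeed is cheaper and more general than searching for an optimal $D^\star$. Based on this insight, we propose DiSP, a sample-and-judge framework that stratifies queries by difficulty. DiSP runs random demonstration trials to estimate the success rate of each training query, trains a lightweight router to predict query difficulty, and trains level-specific judges for sampled demonstrations. At inference, DiSP performs stop-on-acceptance judging under an explicit budget and emits diagnostic risk tags when no suitable context is found. Across five classification datasets with Llama 3-8B and Qwen 2.5-7B, DiSP achieves the best average accuracy, improving over strong learned selection baselines by up to 3.4%, while achieving up to $23\times$ end-to-end wall-clock speedup.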
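
Illustrative sketch (not from the paper): the inference step described in the abstract, stop-on-acceptance judging under an explicit budget, might look roughly like the loop below. Every name here (`stop_on_acceptance`, `route`, `judges`, `threshold`, the risk-tag string) is hypothetical. The abstract does not say how candidates are scored, what the acceptance rule is, or what is returned when the budget runs out, so the score threshold and the best-scored fallback are assumptions made only for this example.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical judge signature: scores a (query, demonstrations) pair in [0, 1].
Judge = Callable[[str, Sequence[str]], float]


def stop_on_acceptance(
    query: str,
    pool: Sequence[str],
    route: Callable[[str], str],   # assumed router: query -> difficulty level
    judges: Dict[str, Judge],      # assumed: one judge per difficulty level
    k: int = 2,                    # demonstrations per sampled context
    budget: int = 8,               # maximum number of sampled contexts
    threshold: float = 0.5,        # assumed acceptance threshold
    seed: int = 0,
) -> Tuple[List[str], Optional[str], int]:
    """Return (demonstrations, risk_tag, trials_used)."""
    rng = random.Random(seed)
    level = route(query)
    judge = judges[level]
    best_score, best_demos = float("-inf"), []
    for trial in range(1, budget + 1):
        demos = rng.sample(list(pool), k)  # random candidate context
        score = judge(query, demos)
        if score >= threshold:
            # Accept the first context the judge approves and stop early.
            return demos, None, trial
        if score > best_score:
            best_score, best_demos = score, demos
    # Budget exhausted: fall back to the best-scored context (an assumption)
    # and attach a diagnostic risk tag.
    return best_demos, f"risk:no_accepted_context:level={level}", budget
```

The early return is what keeps the number of judge calls small on queries where most random contexts work, while the budget caps the cost on hard queries.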

Related papers