Fugu-MT Paper Translation (Summary): Extending Minimal Pairs with Ordinal Surprisal Curves and Entropy Across Applied Domains

Paper summary: Extending Minimal Pairs with Ordinal Surprisal Curves and Entropy Across Applied Domains

arxiv url: http://arxiv.org/abs/2603.14400v1
Date: Sun, 15 Mar 2026 14:31:00 GMT
Status: Translation complete
Last updated in system: 2026-03-17 16:19:35.78946
Title: Extending Minimal Pairs with Ordinal Surprisal Curves and Entropy Across Applied Domains
Authors: Andrew Katz
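Abstract summary: This work extends surprisal-based evaluation from binary grammaticality contrasts to ordinal-scaled classification and scoring tasks. Rather than asking models to generate answers, it measures the information-theoretic "surprise" (negative log probability) they assign to each position on a rating scale. The framework is explored across four domains: social-ecological-technological systems classification, causal statement identification (binary and scaled), figurative language detection, and deductive qualitative coding.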
Reference score (independently computed attention metric): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The minimal pairs paradigm of comparing model probabilities for contrasting completions has proven useful for evaluating linguistic knowledge in language models, yet its application has largely been confined to binary grammaticality judgments over syntactic phenomena. Additionally, standard prompting-based evaluation requires expensive text generation, may elicit post-hoc rationalizations rather than model judgments, and discards information about model uncertainty. We address both limitations by extending surprisal-based evaluation from binary grammaticality contrasts to ordinal-scaled classification and scoring tasks across multiple domains. Rather than asking models to generate answers, we measure the information-theoretic "surprise" (negative log probability) they assign to each position on rating scales (e.g., 1-5 or 1-9), yielding full surprisal curves that reveal both the model's preferred response and its uncertainty via entropy. We explore this framework across four domains: social-ecological-technological systems classification, causal statement identification (binary and scaled), figurative language detection, and deductive qualitative coding. Across these domains, surprisal curves produce interpretable classification signals with clear minima near expected ordinal scale positions, and entropy over the completion tended to distinguish genuinely ambiguous items from easier items.
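The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the log-probabilities a model assigns to each rating-scale label are already available; the function names, the example numbers, and the choice to renormalize over the scale positions before computing entropy are all assumptions made for illustration, not details taken from the paper.

```python
import math


def surprisal_curve(logprobs):
    """Map each scale position to its surprisal, -log p (in nats)."""
    return {label: -lp for label, lp in logprobs.items()}


def scale_entropy(logprobs):
    """Shannon entropy (nats) after renormalizing over the scale positions.

    Renormalizing over the listed labels is an assumption of this sketch.
    """
    # log-sum-exp keeps the normalization numerically stable
    m = max(logprobs.values())
    log_z = m + math.log(sum(math.exp(lp - m) for lp in logprobs.values()))
    probs = [math.exp(lp - log_z) for lp in logprobs.values()]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def preferred_position(logprobs):
    """Return the scale position with the lowest surprisal."""
    curve = surprisal_curve(logprobs)
    return min(curve, key=curve.get)


# Hypothetical log-probabilities on a 1-5 scale (made-up numbers).
confident = {
    1: math.log(0.02),
    2: math.log(0.03),
    3: math.log(0.05),
    4: math.log(0.80),
    5: math.log(0.10),
}
ambiguous = {k: math.log(0.2) for k in range(1, 6)}

print(preferred_position(confident))
print(scale_entropy(confident), scale_entropy(ambiguous))
```

In this toy setup the confident item has a clear surprisal minimum at one scale position and low entropy, while the flat distribution reaches the maximum entropy for five options, which mirrors the distinction between easy and ambiguous items that the abstract describes.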
