Fugu-MT paper translation (abstract): The Frequency Confound in Language-Model Surprisal and Metaphor Novelty

Paper overview: The Frequency Confound in Language-Model Surprisal and Metaphor Novelty

arxiv url: http://arxiv.org/abs/2605.06506v1
Date: Thu, 07 May 2026 16:20:37 GMT
Status: translation complete
Last updated in system: 2026-05-08 22:27:11.987647
Title: The Frequency Confound in Language-Model Surprisal and Metaphor Novelty
Authors: Omar Momen, Sina Zarrieß
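Abstract summary: We analyse surprisal estimates from eight Pythia model sizes and 154 training checkpoints. Across settings, word frequency is a stronger predictor of metaphor novelty than surprisal. These results suggest that the often-reported optimal LM surprisal settings may incorrectly associate contextual predictability with metaphor novelty and processing difficulty, whereas lexical frequency may be the major underlying factor.
Reference score (independently computed attention score): 12.10361869131849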
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Language-model (LM) surprisal is widely used as a proxy for contextual predictability and has been reported to correlate with metaphor novelty judgments. However, surprisal is tightly intertwined with lexical frequency. We explore this interaction on metaphor novelty ratings using two different word frequency measures. We analyse surprisal estimates from eight Pythia model sizes and 154 training checkpoints. Across settings, word frequency is a stronger predictor of metaphor novelty than surprisal. Across training stages, the surprisal–novelty association peaks at an early stage and then falls again, mirroring a similarly timed increase in the surprisal–frequency association. These results suggest that the often-reported optimal LM surprisal settings may incorrectly associate contextual predictability with metaphor novelty and processing difficulty, whereas lexical frequency may be the major underlying factor.
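As a rough illustration of the comparison the abstract describes, the sketch below computes surprisal as the negative log probability of a word in context and then rank-correlates surprisal and word frequency with novelty ratings. It is a minimal sketch, not the authors' pipeline: the helper names (`surprisal`, `ranks`, `spearman`), the choice of Spearman correlation, and all numbers are assumptions invented to show the mechanics, and they say nothing about the paper's actual results.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits of a word whose in-context probability is `prob`."""
    return -math.log2(prob)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy values, NOT data from the paper.
probs = [0.20, 0.05, 0.01, 0.10, 0.002]   # assumed LM probability of each word
log_freq = [5.1, 3.2, 4.0, 2.5, 1.1]      # assumed log word frequencies
novelty = [0.1, 0.4, 0.3, 0.6, 0.9]       # assumed metaphor novelty ratings

surp = [surprisal(p) for p in probs]
rho_surp = spearman(surp, novelty)        # surprisal vs. novelty
rho_freq = spearman(log_freq, novelty)    # frequency vs. novelty
print(f"surprisal-novelty rho = {rho_surp:.2f}")
print(f"frequency-novelty rho = {rho_freq:.2f}")
```

Comparing the absolute values of the two coefficients is one simple way to ask which predictor tracks novelty more closely; repeating the computation per model size or per checkpoint would give the kind of trajectory the abstract refers to.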

List of related papers