Fugu-MT Paper Translation (Abstract): To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times

Paper overview: To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times

arxiv url: http://arxiv.org/abs/2603.12105v1
Date: Thu, 12 Mar 2026 16:10:27 GMT
Status: Translation complete
System update date: 2026-03-13 14:46:26.203462
Title: To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times
Title (reference translation): To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times (English)
Authors: Thomas Hikaru Clark, Carlos Arriaga, Javier Conde, Gonzalo Martínez, Pedro Reviriego
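Abstract summary: Large language models can estimate psycholinguistic norms that correlate with human judgments. We extend this approach to the previously unstudied features of sentence memorability and reading times.
Reference score (independently computed attention score): 4.5166266531313966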
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have recently been shown to produce estimates of psycholinguistic norms, such as valence, arousal, or concreteness, for words and multiword expressions, that correlate with human judgments. These estimates are obtained by prompting an LLM, in zero-shot fashion, with a question similar to those used in human studies. Meanwhile, for other norms such as lexical decision time or age of acquisition, LLMs require supervised fine-tuning to obtain results that align with ground-truth values. In this paper, we extend this approach to the previously unstudied features of sentence memorability and reading times, which involve the relationship between multiple words in a sentence-level context. Our results show that via fine-tuning, models can provide estimates that correlate with human-derived norms and exceed the predictive power of interpretable baseline predictors, demonstrating that LLMs contain useful information about sentence-level features. At the same time, our results show very mixed zero-shot and few-shot performance, providing further evidence that care is needed when using LLM-prompting as a proxy for human cognitive measures.
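Abstract (reference translation): Large language models (LLMs) have recently been shown to produce estimates of psycholinguistic norms such as valence, arousal, and concreteness for words and multiword expressions that correlate with human judgments. These estimates are obtained by prompting an LLM, zero-shot, with a question similar to those used in human studies. For other norms, such as lexical decision time and age of acquisition, LLMs need supervised fine-tuning to produce results that align with ground-truth values. This paper extends the approach to the previously unstudied features of sentence memorability and reading times, which depend on the relationships among multiple words in a sentence-level context. The results show that fine-tuned models provide estimates that correlate with human-derived norms and exceed the predictive power of interpretable baseline predictors, indicating that LLMs contain useful information about sentence-level features. At the same time, zero-shot and few-shot performance is very mixed, providing further evidence that care is needed when using LLM prompting as a proxy for human cognitive measures.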
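
The abstract describes checking whether model estimates correlate with human-derived norms, but does not say which correlation statistic is used. As a minimal sketch, assuming Pearson correlation and using made-up placeholder numbers (the function name `pearson_r` and both lists are hypothetical, not taken from the paper), such a comparison could look like this:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only; NOT data from the paper.
human_norms = [0.62, 0.48, 0.91, 0.35, 0.77]      # e.g. per-sentence human scores
model_estimates = [0.58, 0.52, 0.85, 0.41, 0.70]  # e.g. per-sentence model outputs

r = pearson_r(human_norms, model_estimates)
print(f"Pearson r = {r:.3f}")
```

A rank-based statistic such as Spearman correlation would be an equally plausible choice here; the sketch only illustrates the general shape of comparing estimates against human norms.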
