Fugu-MT Paper Translation (Abstract): Modeling the human lexicon under temperature variations: linguistic factors, diversity and typicality in LLM word associations

Paper summary: Modeling the human lexicon under temperature variations: linguistic factors, diversity and typicality in LLM word associations

arxiv url: http://arxiv.org/abs/2603.18171v1
Date: Wed, 18 Mar 2026 18:10:02 GMT
Status: Translation complete
Last updated in system: 2026-03-20 17:19:05.793652
Title: Modeling the human lexicon under temperature variations: linguistic factors, diversity and typicality in LLM word associations
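Title (reference translation): Modeling the Human Lexicon under Temperature Variations: Linguistic Factors, Diversity, and Typicality in LLM Word Associations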
Authors: Maria Andueza Rodriguez, Marie Candito, Richard Huyghe
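Abstract summary: This study compares word associations produced by humans and by large language models (LLMs). It examines how lexical factors such as word frequency and concreteness influence cue-response pairs. The results show that all models mirror human trends for frequency and concreteness but differ in response variability and typicality.
Reference score (independently computed attention measure): 2.3950779725796765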
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large language models (LLMs) achieve impressive results in terms of fluency in text generation, yet the nature of their linguistic knowledge - in particular the human-likeness of their internal lexicon - remains uncertain. This study compares human and LLM-generated word associations to evaluate how accurately models capture human lexical patterns. Using English cue-response pairs from the SWOW dataset and newly generated associations from three LLMs (Mistral-7B, Llama-3.1-8B, and Qwen-2.5-32B) across multiple temperature settings, we examine (i) the influence of lexical factors such as word frequency and concreteness on cue-response pairs, and (ii) the variability and typicality of LLM responses compared to human responses. Results show that all models mirror human trends for frequency and concreteness but differ in response variability and typicality. Larger models such as Qwen tend to emulate a single "prototypical" human participant, generating highly typical but minimally variable responses, while smaller models such as Mistral and Llama produce more variable yet less typical responses. Temperature settings further influence this trade-off, with higher values increasing variability but decreasing typicality. These findings highlight both the similarities and differences between human and LLM lexicons, emphasizing the need to account for model size and temperature when probing LLM lexical representations.
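Abstract (reference translation): Large language models (LLMs) achieve impressive fluency in text generation, but the nature of their linguistic knowledge, in particular the human-likeness of their internal lexicon, remains unclear. This study compares human and LLM-generated word associations to evaluate how accurately the models capture human lexical patterns. Using English cue-response pairs from the SWOW dataset and associations newly generated by three LLMs (Mistral-7B, Llama-3.1-8B, and Qwen-2.5-32B) at multiple temperature settings, it examines (i) the influence of lexical factors such as word frequency and concreteness on cue-response pairs, and (ii) the variability and typicality of LLM responses compared with human responses. The results show that all models mirror human trends for frequency and concreteness but differ in response variability and typicality. Larger models such as Qwen tend to emulate a single "prototypical" human participant, producing highly typical but minimally variable responses, whereas smaller models such as Mistral and Llama produce more variable but less typical responses. Temperature settings further affect this trade-off: higher values increase variability and decrease typicality. These findings highlight both the similarities and the differences between human and LLM lexicons, and underline the need to account for model size and temperature when probing LLM lexical representations.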
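The abstract reports that higher temperature raises response variability and lowers typicality. The minimal sketch below illustrates the general mechanism only: it applies temperature scaling to a softmax over candidate responses and reports the Shannon entropy of the result. The cue, candidate words, and scores are hypothetical, and using entropy as a stand-in for variability and the top probability as a rough proxy for typicality is an assumption of this sketch, not the metrics used in the paper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_bits(probs):
    """Shannon entropy in bits; higher values mean more varied samples."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical scores for four candidate responses to the cue "dog".
candidates = ["cat", "bone", "leash", "loyal"]
logits = [3.0, 1.5, 1.0, 0.5]

for t in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    top = candidates[probs.index(max(probs))]
    print(f"T={t}: top={top} p(top)={max(probs):.3f} "
          f"entropy={entropy_bits(probs):.3f} bits")
```

In this toy setup the most likely response stays the same at every temperature, but its probability shrinks and the entropy grows as the temperature rises, which matches the direction of the trade-off described in the abstract.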

Related papers