Fugu-MT Paper Translation (Abstract): From Texts to Scores: Tracing the Emergence of Essay Quality Representations in Large Language Models

Paper overview: From Texts to Scores: Tracing the Emergence of Essay Quality Representations in Large Language Models

arxiv url: http://arxiv.org/abs/2606.20152v1
Date: Thu, 18 Jun 2026 12:18:54 GMT
Status: Translation complete
Last updated in system: 2026-06-19 18:23:39.844187
Title: From Texts to Scores: Tracing the Emergence of Essay Quality Representations in Large Language Models
Authors: Jiaxu Zuo, Mu You, Kaixin Lan, Tao Fang, Yujia Huo, Henghua Shen, Lidia S. Chao, Derek F. Wong
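Abstract summary: We find consistent evidence that essay quality information is encoded in a linearly accessible form within large language models. We identify "essay scoring neurons" whose activations strongly correlate with essay scores and whose behavior is sensitive to targeted intervention.
Reference score (independently computed attention metric): 39.459106293017584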
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recent advances in Large Language Models (LLMs) have substantially transformed Automated Essay Scoring (AES), yet the internal mechanisms underlying LLM-based scoring remain poorly understood. In this work, we systematically analyze the hidden representations of eight LLMs across two English essay datasets (ASAP++, CSEE) and one Portuguese dataset (ENEM). Using linear probing, cross-prompt generalization, dimensionality reduction, and neuron-level analyses, we find consistent evidence that essay quality information is encoded in a linearly accessible form within LLM representations. These representations emerge progressively across layers, remain robust across prompting strategies, and partially transfer across essay prompts despite differences in scoring rubrics. In addition, nonlinear probes provide only marginal and inconsistent improvements over linear probes, suggesting that most essay quality information is already linearly decodable. We further identify individual "essay scoring neurons" whose activations strongly correlate with essay scores and whose behavior is sensitive to targeted intervention. Moreover, the layer-wise distribution of these neurons systematically shifts with essay length, with longer essays relying more heavily on deeper layers. Overall, our findings provide evidence that LLMs encode structured representations related to essay quality and offer new insights into the interpretability of LLM-based AES systems.
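
The abstract's two central techniques, linear probing and per-neuron correlation with essay scores, can be illustrated with a minimal sketch. This is not the authors' code: the hidden states and scores below are synthetic stand-ins, and the function names (`fit_linear_probe`, `neuron_score_correlations`), the ridge penalty, and all dimensions are assumptions chosen only for illustration.

```python
import numpy as np


def fit_linear_probe(X, y, l2=1.0):
    """Closed-form ridge regression from hidden states X to scores y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + l2 * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def neuron_score_correlations(X, y):
    """Pearson correlation between each hidden dimension and the scores."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (Xc.T @ yc) / denom


# Synthetic stand-in for one layer's hidden states (hypothetical sizes).
rng = np.random.default_rng(0)
n_essays, hidden_dim = 400, 32
scores = rng.uniform(1.0, 6.0, size=n_essays)
direction = rng.normal(size=hidden_dim)  # assumed linear "quality" direction
hidden = rng.normal(size=(n_essays, hidden_dim))
hidden += 0.5 * np.outer(scores - scores.mean(), direction)

# Fit on the first 300 essays, evaluate on the held-out 100.
w, b = fit_linear_probe(hidden[:300], scores[:300])
pred = hidden[300:] @ w + b
test_r = float(np.corrcoef(pred, scores[300:])[0, 1])

# Rank hidden dimensions by absolute correlation with the score.
corrs = neuron_score_correlations(hidden, scores)
top_neurons = np.argsort(-np.abs(corrs))[:5]
```

With real data, `hidden` would be replaced by activations extracted from a given layer of an LLM, and the probe would be fitted once per layer to trace how decodability changes with depth.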

Related papers