Fugu-MT Paper Translation (Abstract): The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models

Paper overview: The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models

arxiv url: http://arxiv.org/abs/2605.06196v1
Date: Thu, 07 May 2026 13:08:19 GMT
Status: Translation complete
System update date: 2026-05-08 22:27:11.811766
Title: The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models
Authors: Chonghan Qin, Xiachong Feng, Ziyun Song, Xiaocheng Feng, Jing Xiong, Lingpeng Kong
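Abstract summary: The paper shows that large language models encode the granularity of social roles, from micro-level individual experience to macro-level organizational, institutional, or national reasoning. Overall, the findings suggest that social role granularity is not merely a stylistic surface feature, but a structured, ordered, and causally manipulable latent direction in language model behavior.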
Reference score (independently computed attention score): 61.32031248060941
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are routinely prompted to take on social roles ranging from individuals to institutions, yet it remains unclear whether their internal representations encode the granularity of such roles, from micro-level individual experience to macro-level organizational, institutional, or national reasoning. We show that they do. We define a contrast-based Granularity Axis as the difference between mean macro- and micro-role hidden states. In Qwen3-8B, this axis aligns with the principal axis (PC1) of the role representation space at cosine 0.972 and accounts for 52.6% of its variance, indicating that granularity is the dominant geometric axis organizing prompted social roles. We construct 75 social roles across five granularity levels and collect 91,200 role-conditioned responses over shared questions and prompt variants, then extract role-level hidden states and project them onto the axis. Role projections increase monotonically across all five levels, remain stable across layers, prompt variants, endpoint definitions, held-out splits, and score-filtered subsets, and transfer to Llama-3.1-8B-Instruct. The axis is also causally relevant: activation steering along it shifts response granularity in the predicted direction, with Llama moving from 2.00 to 3.17 on a five-point macro scale under positive steering on prompts that admit local responses. The two models differ in controllability, suggesting that steering depends on each model's default operating regime. Overall, our findings suggest that social role granularity is not merely a stylistic surface feature, but a structured, ordered, and causally manipulable latent direction in role-conditioned language model behavior.
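
The abstract's pipeline (contrast axis, comparison with PC1, projection of role states, and steering) can be illustrated with a minimal NumPy sketch. This is not the authors' code. It runs on synthetic vectors rather than real Qwen3-8B or Llama hidden states, and the function names, the choice of levels 1 and 5 as the micro and macro endpoints, the 15-roles-per-level split, and the steering strength `alpha` are all assumptions made for illustration. Real activation steering adds the vector to a model's hidden states during the forward pass; here it is shown only as vector arithmetic.

```python
import numpy as np


def granularity_axis(hidden, levels, micro_level=1, macro_level=5):
    """Unit vector: mean macro-role state minus mean micro-role state.

    Endpoint levels (1 = micro, 5 = macro) are an assumption of this sketch.
    """
    hidden = np.asarray(hidden, dtype=float)
    levels = np.asarray(levels)
    macro_mean = hidden[levels == macro_level].mean(axis=0)
    micro_mean = hidden[levels == micro_level].mean(axis=0)
    diff = macro_mean - micro_mean
    return diff / np.linalg.norm(diff)


def pc1_alignment(hidden, axis):
    """Return (|cosine| between axis and PC1, variance fraction explained by PC1)."""
    centered = hidden - hidden.mean(axis=0)
    # Rows of vt are principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = vt[0]
    explained = s[0] ** 2 / np.sum(s ** 2)
    # The sign of a principal component is arbitrary, so compare magnitudes.
    return abs(float(pc1 @ axis)), float(explained)


def project(hidden, axis):
    """Scalar projection of each (mean-centered) role state onto the axis."""
    return (hidden - hidden.mean(axis=0)) @ axis


def steer(state, axis, alpha):
    """Shift one hidden state along the axis by alpha (hypothetical strength)."""
    return state + alpha * axis


# Synthetic stand-in for role-level hidden states: 75 roles, 5 levels.
rng = np.random.default_rng(0)
dim, roles_per_level = 64, 15
levels = np.repeat(np.arange(1, 6), roles_per_level)
true_dir = rng.normal(size=dim)
true_dir /= np.linalg.norm(true_dir)
hidden = 3.0 * levels[:, None] * true_dir + 0.3 * rng.normal(size=(levels.size, dim))

axis = granularity_axis(hidden, levels)
cosine, explained = pc1_alignment(hidden, axis)
scores = project(hidden, axis)
level_means = [scores[levels == k].mean() for k in range(1, 6)]
steered = steer(hidden[0], axis, alpha=2.0)

print("cosine(axis, PC1):", round(cosine, 3))
print("PC1 variance fraction:", round(explained, 3))
print("mean projection per level:", np.round(level_means, 2))
```

Because the synthetic data is built around a single level-dependent direction, the alignment and monotonic ordering here hold by construction; the sketch shows how the quantities are computed, not evidence for the paper's findings.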
