Fugu-MT Paper Translation (Abstract): Positional Cognitive Specialization: Where Do LLMs Learn To Comprehend and Speak Your Language?

Paper overview: Positional Cognitive Specialization: Where Do LLMs Learn To Comprehend and Speak Your Language?

arxiv url: http://arxiv.org/abs/2604.00923v1
Date: Wed, 01 Apr 2026 14:03:51 GMT
Status: Translation complete
Last updated in system: 2026-04-02 16:44:32.024155
Title: Positional Cognitive Specialization: Where Do LLMs Learn To Comprehend and Speak Your Language?
Title (reference translation): Positional Cognitive Specialization: Where Do LLMs Learn to Understand and Speak a Language?
Authors: Luis Frentzen Salim, Lun-Wei Ku, Hsing-Kuo Kenneth Pao,
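Abstract summary: The paper shows how perceptual and productive specialization emerges in different regions of a language model. The authors propose CogSym, a layer-wise heuristic that enables effective adaptation by fine-tuning only a few early and late layers.
Reference score (independently computed attention metric): 7.398212299621878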
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adapting large language models (LLMs) to new languages is an expensive and opaque process. Understanding how language models acquire new languages and multilingual abilities is key to achieving efficient adaptation. Prior work on multilingual interpretability research focuses primarily on how trained models process multilingual instructions, leaving unexplored the mechanisms through which they acquire new languages during training. We investigate these training dynamics on decoder-only transformers through the lens of two functional cognitive specializations: language perception (input comprehension) and production (output generation). Through experiments on low-resource languages, we demonstrate how perceptual and productive specialization emerges in different regions of a language model by running layer ablation sweeps from the model's input and output directions. Based on the observed specialization patterns, we propose CogSym, a layer-wise heuristic that enables effective adaptation by exclusively fine-tuning a few early and late layers. We show that tuning only the 25% outermost layers achieves downstream task performance within 2-3% deviation from the full fine-tuning baseline. CogSym yields consistent performance with adapter methods such as LoRA, showcasing generalization beyond full fine-tuning. These findings provide insights to better understand how LLMs learn new languages and push toward accessible and inclusive language modeling.
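Abstract (reference translation): Adapting large language models (LLMs) to new languages is an expensive and opaque process. Understanding how language models acquire new languages and multilingual abilities is key to achieving efficient adaptation. Multilingual interpretability research has focused mainly on how trained models process multilingual instructions, leaving unexplored the mechanisms by which models acquire new languages during training. We study these training dynamics in decoder-only transformers through the lens of two functional cognitive specializations: language perception (input comprehension) and production (output generation). Through experiments on low-resource languages, we show how perceptual and productive specialization emerges in different regions of a language model by running layer ablation sweeps from the model's input and output directions. Based on the observed specialization patterns, we propose CogSym, a layer-wise heuristic that enables effective adaptation by fine-tuning only a few early and late layers. We show that tuning only the outermost 25% of layers achieves downstream task performance within 2-3% of the full fine-tuning baseline. CogSym performs consistently with adapter methods such as LoRA, indicating generalization beyond full fine-tuning. These findings offer insight into how LLMs learn new languages and push toward accessible and inclusive language modeling.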
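
The abstract's idea of tuning only the outermost 25% of layers can be illustrated with a small layer-selection sketch. This is not the authors' implementation: the helper names `outermost_layer_indices` and `trainable_mask` are hypothetical, and the even split of the budget between early and late layers is an assumption, since the abstract does not say how the 25% is divided.

```python
from typing import List


def outermost_layer_indices(num_layers: int, fraction: float = 0.25) -> List[int]:
    """Return indices of the outermost layers, taken from both ends of the stack.

    Hypothetical helper. Assumption: the layer budget is split as evenly as
    possible between the earliest and latest layers; the abstract does not
    confirm this exact split.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    # Total number of layers to tune: at least one per end, never more than all.
    budget = min(max(2, round(num_layers * fraction)), num_layers)
    n_early = budget // 2
    n_late = budget - n_early

    early = list(range(n_early))
    late = list(range(num_layers - n_late, num_layers))
    return early + late


def trainable_mask(num_layers: int, fraction: float = 0.25) -> List[bool]:
    """Return one flag per layer: True if that layer would be fine-tuned."""
    selected = set(outermost_layer_indices(num_layers, fraction))
    return [i in selected for i in range(num_layers)]


if __name__ == "__main__":
    # Example: a 32-layer decoder with a 25% budget.
    print(outermost_layer_indices(32))
```

With a PyTorch model, a mask like this could be applied by setting `requires_grad` to `False` on the parameters of every unselected layer, so that only the selected early and late layers receive gradient updates.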


Related papers