Fugu-MT Paper Translation (Summary): Unveiling A Core Linguistic Region in Large Language Models

Paper summary: Unveiling A Core Linguistic Region in Large Language Models

arxiv url: http://arxiv.org/abs/2310.14928v1
Date: Mon, 23 Oct 2023 13:31:32 GMT
Status: Translation complete
Last updated in system: 2023-10-24 19:59:26.897016
Title: Unveiling A Core Linguistic Region in Large Language Models
Title (reference translation): Unveiling a Core Linguistic Region in Large Language Models
Authors: Jun Zhao, Zhihao Zhang, Yide Ma, Qi Zhang, Tao Gui, Luhui Gao and Xuanjing Huang
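Abstract summary: This paper conducts analogical research using brain localization as a prototype. We have discovered a core region in large language models that corresponds to linguistic competence. We observe that an improvement in linguistic competence does not necessarily accompany an elevation in the model's knowledge level.
Reference score (independently computed attention measure): 49.860260050718516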
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Brain localization, which describes the association between specific regions of the brain and their corresponding functions, is widely accepted in the field of cognitive science as an objective fact. Today's large language models (LLMs) possess human-level linguistic competence and can execute complex tasks requiring abstract knowledge and reasoning. To deeply understand the inherent mechanisms of intelligence emergence in LLMs, this paper conducts an analogical research using brain localization as a prototype. We have discovered a core region in LLMs that corresponds to linguistic competence, accounting for approximately 1% of the total model parameters. This core region exhibits significant dimension dependency, and perturbations to even a single parameter on specific dimensions can lead to a loss of linguistic competence. Furthermore, we observe that an improvement in linguistic competence does not necessarily accompany an elevation in the model's knowledge level, which might imply the existence of regions of domain knowledge that are dissociated from the linguistic region. Overall, exploring the LLMs' functional regions provides insights into the foundation of their intelligence. In the future, we will continue to investigate knowledge regions within LLMs and the interactions between them.
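Abstract (reference translation): Brain localization, which describes the relationship between specific regions of the brain and their functions, is widely accepted as objective fact in cognitive science. Today's large language models (LLMs) have human-level linguistic competence and can perform complex tasks that require abstract knowledge and reasoning. To understand in depth the mechanisms by which intelligence emerges in LLMs, this paper conducts analogical research using brain localization as a prototype. We discovered a core region in LLMs that corresponds to linguistic competence and accounts for approximately 1% of the total model parameters. This core region exhibits significant dimension dependency: perturbing even a single parameter on specific dimensions can lead to a loss of linguistic competence. Furthermore, an improvement in linguistic competence is not necessarily accompanied by a rise in the model's knowledge level, which may imply the existence of regions of domain knowledge that are dissociated from the linguistic region. Overall, exploring the functional regions of LLMs provides insight into the foundations of their intelligence. In future work, we will continue to investigate knowledge regions within LLMs and the interactions between them.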
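
The abstract's central claim, that a region of roughly 1% of the parameters matters far more than the rest when perturbed, can be illustrated with a toy sketch. The abstract does not say how the region is located, so everything here is an assumption made for illustration: the importance scores, the function names (`top_fraction_mask`, `perturb`, `toy_loss`), and the quadratic stand-in for a language-modeling loss are all hypothetical and are not the authors' method.

```python
import random

def top_fraction_mask(scores, fraction=0.01):
    """Mark the highest-scoring `fraction` of entries (hypothetical core region)."""
    k = max(1, round(len(scores) * fraction))
    # Indices sorted by descending score; the first k form the mask.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    mask = [False] * len(scores)
    for i in top:
        mask[i] = True
    return mask

def perturb(params, mask, scale, rng):
    """Add Gaussian noise to the masked parameters only."""
    return [p + rng.gauss(0.0, scale) if m else p for p, m in zip(params, mask)]

def toy_loss(params, weights):
    """Stand-in for a model loss: weighted squared distance from the optimum at 0."""
    return sum(w * p * p for p, w in zip(params, weights))

# Assumed setup: 1000 parameters at their optimum, 10 of which are highly sensitive.
n = 1000
weights = [1.0] * (n - 10) + [1000.0] * 10
params = [0.0] * n

core_mask = top_fraction_mask(weights, fraction=0.01)  # scores = sensitivities (assumption)
other_mask = [i < 10 for i in range(n)]                # 10 parameters outside the core

# Same seed for both runs, so the two regions receive identical noise draws.
core_loss = toy_loss(perturb(params, core_mask, 0.1, random.Random(0)), weights)
other_loss = toy_loss(perturb(params, other_mask, 0.1, random.Random(0)), weights)

print("loss after perturbing core region: ", core_loss)
print("loss after perturbing other region:", other_loss)
```

In this toy setting the same noise does far more damage inside the masked region than outside it, which is the qualitative pattern the abstract describes. It says nothing about how large the effect is in a real model.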

Related papers

Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models [40.12943080113246]
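We propose a systematic and comprehensive causal investigation using sparse auto-encoders (SAEs). We extract a wide range of linguistic features from six dimensions. We introduce two metrics: Feature Representation Confidence (FRC) and Feature Intervention Confidence (FIC).
Paper reference translation (metadata) (2025-02-27T18:16:47Z)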
IOLBENCH: Benchmarking LLMs on Linguistic Reasoning [8.20398036986024]
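We introduce IOLBENCH, a new benchmark based on International Linguistics Olympiad (IOL) problems. The dataset contains a variety of problems testing grammar, morphology, phonology, and semantics. Even the most advanced models struggle to handle the intricacies of linguistic complexity.
Paper reference translation (metadata) (2025-01-08T03:15:10Z)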
One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models [19.58983929459173]
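Large language models (LLMs) have learned vast amounts of factual knowledge through self-supervised pre-training on large-scale corpora. LLMs have also demonstrated excellent multilingual capabilities, allowing them to express learned knowledge in multiple languages.
Paper reference translation (metadata) (2024-11-26T13:03:49Z)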
The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units [16.317199232071232]
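Large language models (LLMs) exhibit remarkable capabilities not only on language tasks but also on a variety of non-linguistic tasks. In the human brain, neuroscience has identified a core language system that selectively and causally supports language processing. We identify language-selective units in 18 LLMs using the same localization approach used in neuroscience.
Paper reference translation (metadata) (2024-11-04T17:09:10Z)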
Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models [11.423589362950812]
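Large language models (LLMs) have demonstrated remarkable performance, particularly in multilingual contexts. Recent studies suggest that LLMs can transfer skills learned in one language to other languages, but the internal mechanisms behind this ability remain unclear. This paper provides insights into the internal workings of LLMs and a foundation for improving cross-lingual capabilities.
Paper reference translation (metadata) (2024-10-15T15:49:15Z)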
Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
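Lens is a novel approach for enhancing the multilingual capabilities of large language models (LLMs). It manipulates the hidden representations within the language-agnostic and language-specific subspaces from the top layers of LLMs. It achieves superior results while requiring far fewer computational resources than existing post-training methods.
Paper reference translation (metadata) (2024-10-06T08:51:30Z)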
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
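Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora. We propose language activation probability entropy (LAPE), a novel detection method for identifying language-specific neurons within LLMs. Our results suggest that an LLM's ability to process a particular language is due to a small subset of neurons.
Paper reference translation (metadata) (2024-02-26T09:36:05Z)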
Unveiling Linguistic Regions in Large Language Models [49.298360366468934]
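Large language models (LLMs) exhibit cross-lingual alignment and generalization capabilities. This paper conducts several investigations into the linguistic competence of LLMs.
Paper reference translation (metadata) (2024-02-22T16:56:13Z)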
Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
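This work serves as a foundation for strengthening the field of dialectal NLP by revealing evident disparities and identifying possible pathways for addressing them through mindful data collection.
Paper reference translation (metadata) (2023-10-23T17:42:01Z)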
Dissociating language and thought in large language models [52.39241645471213]
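Large language models (LLMs) are the closest models to date to mastering human language. We ground the distinction between formal and functional competence in human neuroscience, showing that the two rely on different neural mechanisms. LLMs are surprisingly good at formal competence, but their performance on functional competence tasks remains unclear.
Paper reference translation (metadata) (2023-01-16T22:41:19Z)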
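
The "Language-Specific Neurons" entry above names language activation probability entropy (LAPE) without describing it. A minimal sketch of one plausible reading of the name follows: take a neuron's activation probability for each language, normalize those values into a distribution across languages, and compute its entropy, so that low entropy points to a neuron that fires mostly for one language. The function name `lape`, the example probabilities, and the handling of never-active neurons are assumptions made for illustration, not details taken from this page.

```python
import math

def lape(activation_probs):
    """Entropy of a neuron's per-language activation probabilities,
    after normalizing them to sum to 1 across languages."""
    total = sum(activation_probs)
    if total == 0:
        # Assumption: a neuron that never fires is not treated as language-specific.
        return float("inf")
    dist = [p / total for p in activation_probs]
    return -sum(q * math.log(q) for q in dist if q > 0)

# Hypothetical activation probabilities for three languages.
specific_neuron = [0.90, 0.01, 0.01]  # fires almost only for the first language
shared_neuron = [0.30, 0.30, 0.30]    # fires equally for all three

print("entropy, language-specific neuron:", lape(specific_neuron))
print("entropy, shared neuron:", lape(shared_neuron))
```

Under this reading, neurons would be ranked by entropy and the lowest-entropy ones treated as language-specific candidates. Where to place the threshold is left open here.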

The related papers list is generated automatically from the titles and abstracts of papers on this site.

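Information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and assumes no responsibility for any results arising from the use of this site (including all information and translations).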