Fugu-MT paper translation (abstract): Human-like Working Memory Interference in Large Language Models

Paper overview: Human-like Working Memory Interference in Large Language Models

arxiv url: http://arxiv.org/abs/2604.09670v1
Date: Wed, 01 Apr 2026 17:19:46 GMT
Status: translation complete
Last updated in system: 2026-04-19 19:09:11.611164
Title: Human-like Working Memory Interference in Large Language Models
Title (reference translation): Human-like Working Memory Interference in Large Language Models
Authors: Hua-Dong Xiong, Li Ji-An, Jiaqi Huang, Robert C. Wilson, Kwonjoon Lee, Xue-Xin Wei
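Abstract summary: Working memory is fundamental to human reasoning and intelligence. Despite having on the order of 100 billion neurons, both biological and artificial systems exhibit limitations in working memory. Although a two-layer transformer can be trained to solve working memory tasks perfectly, a diverse set of pretrained LLMs continues to show working memory limitations.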
Reference score (independently computed attention score): 13.786393462852395
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Intelligent systems must maintain and manipulate task-relevant information online to adapt to dynamic environments and changing goals. This capacity, known as working memory, is fundamental to human reasoning and intelligence. Despite having on the order of 100 billion neurons, both biological and artificial systems exhibit limitations in working memory. This raises a key question: why do large language models (LLMs) show such limitations, given that transformers have full access to prior context through attention? We find that although a two-layer transformer can be trained to solve working memory tasks perfectly, a diverse set of pretrained LLMs continues to show working memory limitations. Notably, LLMs reproduce interference signatures observed in humans: performance degrades with increasing memory load and is biased by recency and stimulus statistics. Across models, stronger working memory capacity correlates with broader competence on standard benchmarks, mirroring its link to general intelligence in humans. Yet despite substantial variability in working memory performance, LLMs surprisingly converge on a common computational mechanism. Rather than directly copying the relevant memory item from context, models encode multiple memory items in entangled representations, such that successful recall depends on interference control -- actively suppressing task-irrelevant content to isolate the target for readout. Moreover, a targeted intervention that suppresses stimulus content information improves performance, providing causal support for representational interference. Together, these findings identify representational interference as a core constraint on working memory in pretrained LLMs, suggesting that working-memory limits in biological and artificial systems may reflect a shared computational challenge: selecting task-relevant information under interference.
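Abstract (reference translation): Intelligent systems must maintain and manipulate task-relevant information online in order to adapt to dynamic environments and changing goals. This capacity, called working memory, is fundamental to human reasoning and intelligence. Despite having on the order of 100 billion neurons, both biological and artificial systems have limits on working memory. Why do large language models (LLMs) show such limits? A two-layer transformer can be trained to solve working memory tasks perfectly, yet a variety of pretrained LLMs continue to show working memory limits. In particular, LLMs reproduce the interference signatures observed in humans: performance declines as memory load increases and is biased by recency and stimulus statistics. Across models, stronger working memory capacity correlates with broader competence on standard benchmarks, mirroring its relationship to general intelligence in humans. However, despite considerable variation in working memory performance, LLMs converge on a surprisingly common computational mechanism. Rather than copying the relevant memory item directly from context, models encode multiple memory items in entangled representations. Furthermore, a targeted intervention that suppresses stimulus content information improves performance, providing causal support for representational interference. These results suggest that working memory limits in biological and artificial systems may reflect a shared computational challenge: selecting task-relevant information under interference.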

