Fugu-MT Paper Translation (Abstract): Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

Paper overview: Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

arxiv url: http://arxiv.org/abs/2604.26841v1
Date: Wed, 29 Apr 2026 16:06:45 GMT
Status: Translation complete
Last updated in system: 2026-04-30 15:59:36.482519
Title: Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data
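Title (reference translation): Language diffusion models are associative memories capable of retrieving unseen data
Authors: Bao Pham, Mohammed J. Zaki, Luca Ambrogioni, Dmitry Krotov, Matteo Negri
Abstract summary: We show that Uniform-based Discrete Diffusion Models behave as associative memories. We identify a sharp memorization-to-generalization transition governed by the size of the training dataset.
Reference score (independently computed attention score): 35.34519955608767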
License: http://creativecommons.org/licenses/by/4.0/
Abstract: When do language diffusion models memorize their training data, and how to quantitatively assess their true generative regime? We address these questions by showing that Uniform-based Discrete Diffusion Models (UDDMs) fundamentally behave as Associative Memories (AMs) $\textit{with emergent creative capabilities}$. The core idea of an AM is to reliably recover stored data points as $\textit{memories}$ by establishing distinct basins of attraction around them. Historically, models like Hopfield networks use an explicit energy function to guarantee these stable attractors. We broaden this perspective by leveraging the observation that energy is not strictly necessary, as basins of attraction can also be formed via conditional likelihood maximization. By evaluating token recovery of $\textit{training}$ and $\textit{test}$ examples, we identify in UDDMs a sharp memorization-to-generalization transition governed by the size of the training dataset: as it increases, basins around training examples shrink and basins around unseen test examples expand, until both later converge to the same level. Crucially, we can detect this transition using only the conditional entropy of predicted token sequences: memorization is characterized by vanishing conditional entropy, while in the generalization regime the conditional entropy of most tokens remains finite. Thus, conditional entropy offers a practical probe for the memorization-to-generalization transition in deployed models.
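Abstract (reference translation): When do language diffusion models memorize their training data, and how can their true generative regime be quantitatively assessed? Uniform-based Discrete Diffusion Models (UDDMs) fundamentally behave as Associative Memories (AMs) $\textit{with emergent creative capabilities}$. The core idea of an AM is to reliably recover stored data points as $\textit{memories}$. Historically, models like Hopfield networks use an explicit energy function to guarantee these stable attractors. We broaden this perspective by leveraging the observation that energy is not strictly necessary. By evaluating token recovery of $\textit{training}$ and $\textit{test}$ examples, we identify in UDDMs a sharp memorization-to-generalization transition governed by the size of the training dataset. Memorization is characterized by vanishing conditional entropy, while in the generalization regime the conditional entropy of most tokens remains finite. Thus, conditional entropy offers a practical probe for the memorization-to-generalization transition in deployed models.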
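
The conditional-entropy probe mentioned in the abstract can be illustrated with a minimal sketch. It assumes the model exposes, for each token position, a predicted probability distribution over the vocabulary. The function names and the toy distributions below are hypothetical placeholders for illustration, not the authors' code or data.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one predicted token distribution."""
    # Zero-probability entries contribute nothing to the entropy.
    return sum(-p * math.log(p) for p in probs if p > 0.0)


def mean_conditional_entropy(predicted):
    """Average per-token entropy over a sequence of predicted distributions."""
    return sum(token_entropy(p) for p in predicted) / len(predicted)


# Hypothetical predictions over a 4-token vocabulary (made-up numbers).
# One-hot predictions mimic the memorization regime: entropy vanishes.
memorized = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
# Spread-out predictions mimic the generalization regime: entropy stays finite.
generalizing = [[0.25, 0.25, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1]]

h_mem = mean_conditional_entropy(memorized)
h_gen = mean_conditional_entropy(generalizing)
print("memorized:", h_mem)
print("generalizing:", h_gen)
```

In this toy setting the one-hot predictions give zero entropy and the spread-out predictions give a strictly positive value, which is the qualitative contrast the abstract describes. How the paper aggregates entropy across tokens or sets a threshold is not stated on this page, so the simple mean used here is only an assumption.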

Related papers