Fugu-MT Paper Translation (Abstract): Continuous Latent Diffusion Language Model

Paper overview: Continuous Latent Diffusion Language Model

arxiv url: http://arxiv.org/abs/2605.06548v1
Date: Thu, 07 May 2026 16:44:56 GMT
Status: Translation complete
Last updated in system: 2026-05-08 22:27:12.008302
Title: Continuous Latent Diffusion Language Model
Authors: Hongcan Guo, Qinyu Zhao, Yian Zhao, Shen Nie, Rui Zhu, Qiushan Guo, Feng Wang, Tao Yang, Hengshuang Zhao, Guoqiang Wei, Yan Zeng
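Abstract summary: Large language models have achieved remarkable success under the autoregressive paradigm. Existing alternatives still struggle to jointly achieve generation efficiency, scalable representation learning, and effective global semantic modeling. The authors propose Cola DLM, a hierarchical latent diffusion language model that frames text generation through hierarchical information decomposition.
Reference score (attention level computed independently by this site): 48.974403879186916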
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models have achieved remarkable success under the autoregressive paradigm, yet high-quality text generation need not be tied to a fixed left-to-right order. Existing alternatives still struggle to jointly achieve generation efficiency, scalable representation learning, and effective global semantic modeling. We propose Cola DLM, a hierarchical latent diffusion language model that frames text generation through hierarchical information decomposition. Cola DLM first learns a stable text-to-latent mapping with a Text VAE, then models a global semantic prior in continuous latent space with a block-causal DiT, and finally generates text through conditional decoding. From a unified Markov-path perspective, its diffusion process performs latent prior transport rather than token-level observation recovery, thereby separating global semantic organization from local textual realization. This design yields a more flexible non-autoregressive inductive bias, supports semantic compression and prior fitting in continuous space, and naturally extends to other continuous modalities. Through experiments spanning 4 research questions, 8 benchmarks, strictly matched ~2B-parameter autoregressive and LLaDA baselines, and scaling curves up to about 2000 EFLOPs, we identify an effective overall configuration of Cola DLM and verify its strong scaling behavior for text generation. Taken together, the results establish hierarchical continuous latent prior modeling as a principled alternative to strictly token-level language modeling, where generation quality and scaling behavior may better reflect model capability than likelihood, while also suggesting a concrete path toward unified modeling across discrete text and continuous modalities.
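The abstract names a "block-causal DiT" but does not spell out the attention pattern. As a minimal sketch, assuming "block-causal" carries its usual meaning (full attention among latent positions inside one block, causal attention across blocks), the mask could be built as below. The function name `block_causal_mask` and the parameters `num_positions` and `block_size` are hypothetical and are not taken from the paper.

```python
def block_causal_mask(num_positions: int, block_size: int) -> list[list[bool]]:
    """Return mask[i][j] == True when position i may attend to position j.

    Assumption (not stated in the abstract): a position sees every position
    in its own block and in all earlier blocks, and nothing in later blocks.
    """
    if num_positions < 0 or block_size <= 0:
        raise ValueError("num_positions must be >= 0 and block_size must be > 0")
    # Block index of every position, e.g. block_size=2 gives 0, 0, 1, 1, 2, ...
    block_of = [i // block_size for i in range(num_positions)]
    return [
        [block_of[j] <= block_of[i] for j in range(num_positions)]
        for i in range(num_positions)
    ]


if __name__ == "__main__":
    # Print the mask as a grid: "x" = attention allowed, "." = masked out.
    for row in block_causal_mask(num_positions=6, block_size=2):
        print("".join("x" if allowed else "." for allowed in row))
```

With six positions and a block size of two, positions 0 and 1 attend to each other but not to positions 2 through 5, while the last block attends to every position. Plain nested lists keep the sketch dependency-free; a real model would more likely build the same pattern as a tensor.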

Related papers