Fugu-MT Paper Translation (Abstract): Breaking the Factorization Barrier in Diffusion Language Models

Paper summary: Breaking the Factorization Barrier in Diffusion Language Models

arxiv url: http://arxiv.org/abs/2603.00045v1
Date: Mon, 09 Feb 2026 08:36:39 GMT
Status: Translation complete
System update date: 2026-03-09 01:20:07.991152
Title: Breaking the Factorization Barrier in Diffusion Language Models
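Title (reference translation): Breaking the Factorization Barrier in Diffusion Language Models
Authors: Ian Li, Zilei Shao, Benjie Wang, Rose Yu, Guy Van den Broeck, Anji Liu
Abstract summary: The "factorization barrier" hinders efficient parallel generation in diffusion language models. We propose Coupled Discrete Diffusion (CoDD), which replaces the fully factorized output distribution. We show that CoDD seamlessly enhances diverse diffusion language model architectures with negligible overhead.
Reference score (independently computed attention score): 59.946071582340146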
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion language models theoretically allow for efficient parallel generation but are practically hindered by the "factorization barrier": the assumption that simultaneously predicted tokens are independent. This limitation forces a trade-off: models must either sacrifice speed by resolving dependencies sequentially or suffer from incoherence due to factorization. We argue that this barrier arises not from limited backbone expressivity, but from a structural misspecification: models are restricted to fully factorized outputs because explicitly parameterizing a joint distribution would require the Transformer to output a prohibitively large number of parameters. We propose Coupled Discrete Diffusion (CoDD), a hybrid framework that breaks this barrier by replacing the fully-factorized output distribution with a lightweight, tractable probabilistic inference layer. This formulation yields a distribution family that is significantly more expressive than standard factorized priors, enabling the modeling of complex joint dependencies, yet remains compact enough to avoid the prohibitive parameter explosion associated with full joint modeling. Empirically, CoDD seamlessly enhances diverse diffusion language model architectures with negligible overhead, matching the reasoning performance of computationally intensive Reinforcement Learning baselines at a fraction of the training cost. Furthermore, it prevents performance collapse in few-step generation, enabling high-quality outputs at significantly reduced latencies. Code available at: https://github.com/liuanji/CoDD
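Abstract (reference translation): Diffusion language models allow efficient parallel generation in theory, but in practice they are held back by the "factorization barrier", the assumption that tokens predicted at the same time are independent. Models must either give up speed by resolving dependencies sequentially or suffer from incoherence caused by factorization. We argue that this barrier stems not from limited backbone expressivity but from a structural misspecification: models are restricted to fully factorized outputs because explicitly parameterizing a joint distribution would require the Transformer to output a prohibitively large number of parameters. The proposed Coupled Discrete Diffusion (CoDD) is a hybrid framework that breaks this barrier by replacing the fully factorized output distribution with a lightweight, tractable probabilistic inference layer. This formulation is far more expressive than standard factorized priors and can model complex joint dependencies, yet it stays compact enough to avoid the prohibitive parameter explosion of full joint modeling. Empirically, CoDD seamlessly enhances diverse diffusion language model architectures with negligible overhead, matching the reasoning performance of computationally intensive Reinforcement Learning baselines at a fraction of the training cost. It also prevents performance collapse in few-step generation, enabling high-quality outputs at significantly reduced latency. Code is available at https://github.com/liuanji/CoDD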
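
The factorization barrier described in the abstract can be illustrated with a small Python sketch. It is a toy example and not the CoDD method: the two-token vocabulary, the `joint` table, and all function names are hypothetical choices made for illustration. It shows that decoding two positions independently from their marginals puts probability mass on sequences the true joint never produces, and that a full joint table over k positions grows as V^k while a factorized output needs only k * V values.

```python
from itertools import product

# Hypothetical toy joint over two token positions (not taken from the paper).
joint = {("New", "York"): 0.5, ("Los", "Angeles"): 0.5}


def marginals(joint):
    """Per-position marginal distributions of a joint over token tuples."""
    n = len(next(iter(joint)))
    margs = [dict() for _ in range(n)]
    for seq, p in joint.items():
        for i, tok in enumerate(seq):
            margs[i][tok] = margs[i].get(tok, 0.0) + p
    return margs


def factorized(margs):
    """Product of marginals: what independent per-position decoding samples from."""
    out = {}
    for combo in product(*(m.items() for m in margs)):
        seq = tuple(tok for tok, _ in combo)
        p = 1.0
        for _, q in combo:
            p *= q
        out[seq] = p
    return out


def output_sizes(vocab_size, k):
    """Output values needed: factorized (k * V) versus a full joint table (V ** k)."""
    return k * vocab_size, vocab_size ** k


margs = marginals(joint)
fact = factorized(margs)

# Mass assigned to sequences such as ("New", "Angeles") that the joint never emits.
incoherent_mass = sum(p for seq, p in fact.items() if seq not in joint)

if __name__ == "__main__":
    print(fact)
    print("incoherent mass:", incoherent_mass)
    print("sizes for V=50000, k=4:", output_sizes(50000, 4))
```

In this toy case half of the factorized probability mass lands on mixed pairs, which is the incoherence the abstract attributes to factorization. The size comparison shows why an explicit joint table is impractical at realistic vocabulary sizes.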

Related papers