Fugu-MT Paper Translation (Abstract): Locally Coherent Parallel Decoding in Diffusion Language Models

Paper summary: Locally Coherent Parallel Decoding in Diffusion Language Models

arxiv url: http://arxiv.org/abs/2603.20216v1
Date: Tue, 03 Mar 2026 09:56:53 GMT
Status: Translation complete
Last updated in system: 2026-04-06 02:36:12.897575
Title: Locally Coherent Parallel Decoding in Diffusion Language Models
Authors: Michael Hersche, Nicolas Menet, Ronan Tanios, Abbas Rahimi
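Abstract summary: Diffusion language models (DLMs) offer sub-linear generation latency and bidirectional capabilities. Standard DLMs sample tokens independently from conditional marginal distributions. The paper introduces CoDiLA, a method that reconciles parallel sampling with local dependency modeling.
Reference score (site-computed attention metric): 6.620088179445404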
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models, offering sub-linear generation latency and bidirectional capabilities that are particularly appealing for code generation and editing. Achieving sub-linear latency in discrete DLMs requires predicting multiple tokens in parallel. However, standard DLMs sample tokens independently from conditional marginal distributions, failing to capture the joint dependencies among concurrently generated tokens. As a result, they often lead to syntactic inconsistencies and break multi-token structures. In this work, we introduce CoDiLA (Coherent Diffusion with Local Autoregression), a method that reconciles parallel sampling with local dependency modeling. Rather than forcing the DLM to resolve fine-grained syntax, CoDiLA delegates local decoding to a small, auxiliary AR model operating on the diffusion latents. This design allows for parallel block generation while ensuring sequential validity within each block and maintaining core DLM capabilities, including bidirectional modeling across blocks. We demonstrate that using a highly compact auxiliary AR model (e.g., 0.6B parameters) effectively eliminates coherence artifacts, establishing a new Pareto frontier for accuracy and speed in code generation benchmarks.
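The coherence problem described in the abstract can be illustrated with a toy example. The sketch below is not CoDiLA and uses no real model: it assumes a hypothetical two-token block whose only valid outcomes are matched bracket pairs, then compares sampling each position independently from its marginal with sampling the second token conditioned on the first, as a local autoregressive decoder would. All names (`JOINT`, `sample_independent`, `sample_sequential`, `invalid_rate`) are made up for illustration.

```python
import random
from collections import Counter

# Hypothetical joint distribution over a two-token block.
# Only matched bracket pairs have non-zero probability.
JOINT = {("(", ")"): 0.5, ("[", "]"): 0.5}


def marginals(joint):
    """Return the per-position marginal distributions of a two-token joint."""
    first, second = Counter(), Counter()
    for (a, b), p in joint.items():
        first[a] += p
        second[b] += p
    return dict(first), dict(second)


def sample(dist, rng):
    """Draw one token from a {token: probability} mapping."""
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


def sample_independent(joint, rng):
    """Sample both positions from their marginals, ignoring the dependency."""
    first, second = marginals(joint)
    return sample(first, rng), sample(second, rng)


def sample_sequential(joint, rng):
    """Sample the first token, then the second conditioned on the first."""
    first, _ = marginals(joint)
    a = sample(first, rng)
    conditional = {b: p / first[a] for (x, b), p in joint.items() if x == a}
    return a, sample(conditional, rng)


def invalid_rate(sampler, n=10_000, seed=0):
    """Fraction of sampled pairs that have zero probability under JOINT."""
    rng = random.Random(seed)
    bad = sum(sampler(JOINT, rng) not in JOINT for _ in range(n))
    return bad / n


if __name__ == "__main__":
    print("independent:", invalid_rate(sample_independent))
    print("sequential: ", invalid_rate(sample_sequential))
```

In this toy setting, independent sampling produces a mismatched pair such as `("(", "]")` roughly half the time, while the sequential sampler never does. That is the kind of broken multi-token structure the abstract attributes to sampling from conditional marginals.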
