Fugu-MT Paper Translation (Abstract): Diffusion In Diffusion: Breaking the Autoregressive Bottleneck in Block Diffusion Models

Paper summary: Diffusion In Diffusion: Breaking the Autoregressive Bottleneck in Block Diffusion Models

arxiv url: http://arxiv.org/abs/2601.13599v1
Date: Tue, 20 Jan 2026 05:00:26 GMT
Status: Translation complete
Last updated in system: 2026-01-21 22:47:23.159221
Title: Diffusion In Diffusion: Breaking the Autoregressive Bottleneck in Block Diffusion Models
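Title (reference translation): Diffusion in Diffusion: Breaking the Autoregressive Bottleneck in Block Diffusion Models
Authors: Linrui Ma, Yufei Cui, Kai Han, Yunhe Wang
Abstract summary: Block diffusion language models, operating as a semi-autoregressive paradigm, combine the strengths of both the autoregressive and diffusion paradigms. Their strict unidirectional block dependencies introduce irreversibility and sacrifice the global planning capabilities for which diffusion models are renowned. This paper proposes Diffusion in Diffusion to overcome the irreversibility and myopia problems inherent in block diffusion models.
Reference score (independently computed attention score): 26.45111031153368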
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Block diffusion language models, operating as semi-autoregressive paradigms, combine the strengths of both autoregressive and diffusion paradigms. However, their strict unidirectional block dependencies introduce irreversibility and sacrifice the global planning capabilities for which diffusion models are renowned. In order to address these issues, we propose Diffusion in Diffusion, a draft-then-refine framework designed to overcome the irreversibility and myopia problems inherent in block diffusion models. Our approach first employs block diffusion to generate rapid drafts using small blocks, then refines these drafts through global bidirectional diffusion with a larger bidirectional receptive field. We utilise snapshot confidence remasking to identify the most critical tokens that require modification, and apply mix-scale training to expand the block diffusion model's global capabilities. Empirical results demonstrate that our approach sets a new benchmark for discrete diffusion models on the OpenWebText dataset. Using just 26% of the fine-tuning budget of baseline models, we reduce generative perplexity from 25.7 to 21.9, significantly narrowing the performance gap with autoregressive models.
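Abstract (reference translation): Block diffusion language models, which operate as a semi-autoregressive paradigm, combine the strengths of the autoregressive and diffusion paradigms. However, their strict unidirectional block dependencies introduce irreversibility and sacrifice the global planning capability for which diffusion models are known. To address these issues, we propose Diffusion in Diffusion, a draft-then-refine framework that targets the irreversibility and myopia problems inherent in block diffusion models. The method first uses block diffusion with small blocks to generate fast drafts, then refines them through global bidirectional diffusion with a larger bidirectional receptive field. We use snapshot confidence remasking to identify the most critical tokens that need modification, and apply mix-scale training to extend the global capabilities of the block diffusion model. Experiments show that the method sets a new benchmark for discrete diffusion models on the OpenWebText dataset. Using only 26% of the baseline models' fine-tuning budget, it reduces generative perplexity from 25.7 to 21.9, significantly narrowing the performance gap with autoregressive models.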
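
The draft-then-refine flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how confidence is computed, how many tokens are remasked, or how the models are called. Everything below is an assumption made for illustration, including the names `select_remask_positions`, `draft_then_refine`, `toy_draft`, `toy_refine`, the `MASK` symbol, and the fixed remask ratio. Confidence is assumed to be one score per token recorded at draft time.

```python
from typing import Callable, List, Sequence, Tuple

MASK = "<mask>"  # hypothetical mask symbol, not taken from the paper


def select_remask_positions(confidences: Sequence[float], ratio: float) -> List[int]:
    """Return the indices of the lowest-confidence tokens, in position order.

    Assumption: a fixed fraction `ratio` of the tokens is remasked.
    """
    k = int(len(confidences) * ratio)
    # Sort by confidence; ties are broken by position for determinism.
    order = sorted(range(len(confidences)), key=lambda i: (confidences[i], i))
    return sorted(order[:k])


def draft_then_refine(
    draft_fn: Callable[[int], Tuple[List[str], List[float]]],
    refine_fn: Callable[[List[str]], List[str]],
    length: int,
    ratio: float,
) -> List[str]:
    """Draft a sequence, remask its least confident tokens, then refine."""
    tokens, confidences = draft_fn(length)       # stage 1: fast draft
    positions = set(select_remask_positions(confidences, ratio))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    return refine_fn(masked)                     # stage 2: global refinement


def toy_draft(length: int) -> Tuple[List[str], List[float]]:
    """Stand-in for a block diffusion drafter: fixed tokens and confidences."""
    tokens = [f"t{i}" for i in range(length)]
    confidences = [0.9, 0.2, 0.8, 0.1, 0.7, 0.95][:length]
    return tokens, confidences


def toy_refine(masked: List[str]) -> List[str]:
    """Stand-in for a bidirectional refiner: fills each mask with a new token."""
    return [f"r{i}" if t == MASK else t for i, t in enumerate(masked)]


if __name__ == "__main__":
    print(draft_then_refine(toy_draft, toy_refine, length=6, ratio=0.5))
```

In this sketch only the remasked positions change during refinement, so tokens the drafter was confident about pass through untouched. A real refiner would condition on the full sequence in both directions when filling the masks.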


Related papers