Fugu-MT Paper Translation (Abstract): Diffusion Language Models Know the Answer Before Decoding

Paper abstract: Diffusion Language Models Know the Answer Before Decoding

arxiv url: http://arxiv.org/abs/2508.19982v1
Date: Wed, 27 Aug 2025 15:40:25 GMT
Status: Translation complete
Last updated in system: 2025-08-28 19:07:41.68751
Title: Diffusion Language Models Know the Answer Before Decoding
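Title (reference translation): Diffusion language models know the answer before decoding
Authors: Pengxiang Li, Yefan Zhou, Dilxat Muhtar, Lu Yin, Shilin Yan, Li Shen, Yi Liang, Soroush Vosoughi, Shiwei Liu
Abstract summary: Diffusion language models (DLMs) have emerged as an alternative to autoregressive approaches. Our work highlights and leverages an overlooked property of DLMs, early answer convergence. Prophet is a training-free fast decoding paradigm that enables early commit decoding.
Reference score (independently computed attention score): 56.96815863705218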
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion language models (DLMs) have recently emerged as an alternative to autoregressive approaches, offering parallel sequence generation and flexible token orders. However, their inference remains slower than that of autoregressive models, primarily due to the cost of bidirectional attention and the large number of refinement steps required for high-quality outputs. In this work, we highlight and leverage an overlooked property of DLMs, early answer convergence: in many cases, the correct answer can be internally identified halfway through the refinement steps, well before the final decoding step, under both semi-autoregressive and random remasking schedules. For example, on GSM8K and MMLU, up to 97% and 99% of instances, respectively, can be decoded correctly using only half of the refinement steps. Building on this observation, we introduce Prophet, a training-free fast decoding paradigm that enables early commit decoding. Specifically, Prophet dynamically decides whether to continue refinement or to go "all-in" (i.e., decode all remaining tokens in one step), using the confidence gap between the top-2 prediction candidates as the criterion. It integrates seamlessly into existing DLM implementations, incurs negligible overhead, and requires no additional training. Empirical evaluations of LLaDA-8B and Dream-7B across multiple tasks show that Prophet reduces the number of decoding steps by up to 3.4x while preserving high generation quality. These results recast DLM decoding as a problem of when to stop sampling, and demonstrate that early decode convergence provides a simple yet powerful mechanism for accelerating DLM inference, complementary to existing speedup techniques. Our code is publicly available at https://github.com/pixeli99/Prophet.
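Abstract (reference translation): Diffusion language models (DLMs) have recently emerged as an alternative to autoregressive approaches, offering parallel sequence generation and flexible token orders. However, their inference remains slower than that of autoregressive models, mainly because of the cost of bidirectional attention and the many refinement steps needed for high-quality outputs. In many cases, the correct answer can be identified internally halfway through the refinement steps, before the final decoding step, under both semi-autoregressive and random remasking schedules. For example, on GSM8K and MMLU, up to 97% and 99% of instances, respectively, can be decoded correctly using only half of the refinement steps. Based on this observation, we introduce Prophet, a training-free fast decoding method that enables early commit decoding. Specifically, Prophet dynamically decides whether to continue refinement or to go "all-in" (i.e., decode all remaining tokens in one step), using the confidence gap between the top-2 prediction candidates as the criterion. It integrates seamlessly into existing DLM implementations, incurs negligible overhead, and requires no additional training. Empirical evaluations of LLaDA-8B and Dream-7B across multiple tasks show that Prophet reduces the number of decoding steps by up to 3.4x while preserving high generation quality. These results recast DLM decoding as a problem of when to stop sampling and show that early decode convergence is a simple yet powerful mechanism for accelerating DLM inference, complementary to existing speedup techniques. Our code is publicly available at https://github.com/pixeli99/Prophet.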
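The early-commit rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes the confidence gap is measured in probability space and averaged over the positions that are still masked, and that a single fixed threshold is used. The names `MASK`, `top2_gap`, `should_commit`, and `commit_all` are hypothetical, chosen only for this example.

```python
import math
from typing import List, Optional, Sequence

MASK = None  # hypothetical marker for a position that has not been decoded yet


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one position's vocabulary logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top2_gap(logits: Sequence[float]) -> float:
    """Probability gap between the best and second-best candidate token."""
    probs = sorted(softmax(logits), reverse=True)
    return probs[0] - probs[1]


def should_commit(
    logits_per_pos: Sequence[Sequence[float]],
    tokens: Sequence[Optional[int]],
    threshold: float,
) -> bool:
    """Return True when the mean top-2 gap over masked positions reaches the threshold.

    Averaging over masked positions and using one fixed threshold are
    assumptions of this sketch, not details taken from the abstract.
    """
    gaps = [top2_gap(l) for l, t in zip(logits_per_pos, tokens) if t is MASK]
    if not gaps:  # nothing left to decode
        return True
    return sum(gaps) / len(gaps) >= threshold


def commit_all(
    logits_per_pos: Sequence[Sequence[float]],
    tokens: Sequence[Optional[int]],
) -> List[int]:
    """Go "all-in": fill every masked position with its argmax token in one step."""
    return [
        max(range(len(l)), key=l.__getitem__) if t is MASK else t
        for l, t in zip(logits_per_pos, tokens)
    ]
```

In a decoding loop, one would call `should_commit` after each refinement step and, when it returns True, replace the remaining steps with a single `commit_all` call; otherwise refinement continues as usual. The threshold value controls the trade-off between saved steps and the risk of committing too early.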

Related papers