Fugu-MT paper translation (abstract): Deferred Commitment Decoding for Diffusion Language Models with Confidence-Aware Sliding Windows

Paper overview: Deferred Commitment Decoding for Diffusion Language Models with Confidence-Aware Sliding Windows

arxiv url: http://arxiv.org/abs/2601.02076v1
Date: Mon, 05 Jan 2026 12:57:33 GMT
Status: translation complete
System update date: 2026-01-06 16:25:23.136076
Title: Deferred Commitment Decoding for Diffusion Language Models with Confidence-Aware Sliding Windows
Title (reference translation): Deferred Commitment Decoding for Diffusion Language Models with Confidence-Aware Sliding Windows
Authors: Yingte Shu, Yuchuan Tian, Chao Xu, Yunhe Wang, Hanting Chen
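Abstract summary: We propose Deferred Commitment Decoding (DCD), a training-free decoding strategy. DCD maintains a confidence-aware sliding window over masked tokens, resolving low-uncertainty tokens early while deferring high-uncertainty tokens until sufficient contextual evidence becomes available. Experiments show that DCD improves generation accuracy by 1.39% on average over fixed block-based diffusion methods at comparable decoding time, with the largest improvement reaching 9.0%.
Reference score (attention score computed independently by this site): 33.361153168706444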
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion language models (DLMs) have recently emerged as a strong alternative to autoregressive models by enabling parallel text generation. To improve inference efficiency and KV-cache compatibility, prior work commonly adopts block-based diffusion, decoding tokens block by block. However, this paradigm suffers from a structural limitation that we term Boundary-Induced Context Truncation (BICT): undecoded tokens near block boundaries are forced to commit without access to nearby future context, even when such context could substantially reduce uncertainty. This limitation degrades decoding confidence and generation quality, especially for tasks requiring precise reasoning, such as mathematical problem solving and code generation. We propose Deferred Commitment Decoding (DCD), a novel, training-free decoding strategy that mitigates this issue. DCD maintains a confidence-aware sliding window over masked tokens, resolving low-uncertainty tokens early while deferring high-uncertainty tokens until sufficient contextual evidence becomes available. This design enables effective bidirectional information flow within the decoding window without sacrificing efficiency. Extensive experiments across multiple diffusion language models, benchmarks, and caching configurations show that DCD improves generation accuracy by 1.39% with comparable time on average compared to fixed block-based diffusion methods, with the most significant improvement reaching 9.0%. These results demonstrate that deferring token commitment based on uncertainty is a simple yet effective principle for improving both the quality and efficiency of diffusion language model decoding.
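Abstract (reference translation): Diffusion language models (DLMs) have recently emerged as a strong alternative to autoregressive models because they allow text to be generated in parallel. To improve inference efficiency and compatibility with the KV cache, prior work commonly adopts block-based diffusion, which decodes tokens one block at a time. This paradigm, however, suffers from a structural limitation called Boundary-Induced Context Truncation (BICT): undecoded tokens near a block boundary are forced to commit without access to nearby future context, even when that context would substantially reduce their uncertainty. The limitation lowers decoding confidence and generation quality, particularly on tasks that require precise reasoning, such as mathematical problem solving and code generation. The paper proposes Deferred Commitment Decoding (DCD), a new training-free decoding strategy that mitigates this problem. DCD maintains a confidence-aware sliding window over masked tokens, resolving low-uncertainty tokens early while deferring high-uncertainty tokens until sufficient contextual evidence becomes available. This design allows effective bidirectional information flow within the decoding window without sacrificing efficiency. Extensive experiments across multiple diffusion language models, benchmarks, and caching configurations show that DCD improves generation accuracy by 1.39% on average over fixed block-based diffusion methods at comparable time, with the largest improvement reaching 9.0%. These results indicate that deferring token commitment based on uncertainty is a simple yet effective principle for improving both the quality and the efficiency of diffusion language model decoding.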
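
The decoding rule described in the abstract (commit low-uncertainty tokens inside a sliding window, defer high-uncertainty ones) can be illustrated with a small toy. This is only a sketch under stated assumptions, not the authors' implementation: the abstract gives no algorithmic details, so `toy_predict`, `window_size`, `threshold`, and the fallback rule that commits the single most confident position when nothing clears the threshold are all hypothetical. The "model" is a hand-written function whose confidence depends on which neighbours are already decoded.

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # marks a position that has not been decoded yet
Sequence = List[Optional[str]]
# A predictor returns one (token, confidence) pair per position.
Predictor = Callable[[Sequence], List[Tuple[str, float]]]


def deferred_commitment_decode(
    length: int,
    predict: Predictor,
    window_size: int = 3,   # hypothetical parameter, not from the paper
    threshold: float = 0.75,  # hypothetical parameter, not from the paper
) -> Tuple[List[str], List[List[int]]]:
    """Toy decoder: returns the tokens and the positions committed per step."""
    seq: Sequence = [MASK] * length
    commit_order: List[List[int]] = []
    while any(tok is MASK for tok in seq):
        # The window covers the leftmost still-masked positions, so it
        # slides forward as tokens are committed and keeps deferred ones.
        masked = [i for i, tok in enumerate(seq) if tok is MASK]
        window = masked[:window_size]
        preds = predict(seq)
        # Commit every window position that is confident enough.
        ready = [i for i in window if preds[i][1] >= threshold]
        if not ready:
            # Assumed fallback to guarantee progress: commit the single
            # most confident position in the window.
            ready = [max(window, key=lambda i: preds[i][1])]
        for i in ready:
            seq[i] = preds[i][0]
        commit_order.append(ready)
    return [tok for tok in seq if tok is not MASK], commit_order


TARGET = ["x", "=", "1", "+", "2"]
HARD = {1}  # position that only becomes certain once its right neighbour is known


def toy_predict(seq: Sequence) -> List[Tuple[str, float]]:
    """Hypothetical stand-in for a diffusion LM: fixed tokens, context-dependent confidence."""
    preds = []
    for i, token in enumerate(TARGET):
        left_known = i == 0 or seq[i - 1] is not MASK
        right_known = i + 1 < len(seq) and seq[i + 1] is not MASK
        if i in HARD:
            conf = 1.0 if right_known else 0.25
        else:
            conf = 1.0 if left_known else 0.5
        preds.append((token, conf))
    return preds


tokens, order = deferred_commitment_decode(len(TARGET), toy_predict)
print(tokens)  # decoded sequence
print(order)   # positions committed at each step
```

In this toy, position 1 is skipped until position 2 has been decoded, after which it is committed with full confidence. Setting `window_size=1` forces a strict left-to-right order, so position 1 must commit at confidence 0.25, which loosely mirrors the forced commitment at a block boundary that the abstract calls BICT.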

Related papers