Fugu-MT Paper Translation (Abstract): When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models

Paper overview: When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models

arxiv url: http://arxiv.org/abs/2604.23994v1
Date: Mon, 27 Apr 2026 03:21:07 GMT
Status: Translation complete
Last updated in system: 2026-04-28 17:12:07.717093
Title: When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models
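Title (reference translation): When Should We Commit? Toward Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models
Authors: Danny Wang, Ruihong Qiu, Zi Huang
Abstract summary: The authors propose self-containedness as a principled criterion for block commitment. A block is self-contained if its predictions remain consistent with (Future-Aware, FA) or without (No-Future, NF) access to future context. They provide theoretical justification linking self-containedness to predictive consistency, and report extensive experiments validating the efficacy of Variable-size Self-contained Blocks (VSB).
Reference score (independently computed attention score): 36.08108046941572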
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Discrete diffusion language models (dLLMs) enable parallel token updates with bidirectional attention, yet practical generation typically adopts blockwise semi-autoregressive decoding. This switch creates a training-inference mismatch: training denoises with full-sequence context, while inference commits tokens within a bounded block without future context. Therefore, decoding with fixed-size or heuristic-based blocks can lead to premature token commitments, as decisions are made without full access to future context that could alter those choices. Motivated by this, we propose self-containedness as a principled criterion for block commitment. A block is self-contained if its predictions remain consistent with (Future-Aware, FA) or without (No-Future, NF) access to future context, reframing block boundary selection as a test of self-containedness rather than a heuristic choice. Based on this principle, we introduce Variable-size Self-contained Blocks (VSB) for dLLMs. VSB scores and selects block boundaries using the divergence between token-level predictive distributions under NF and FA conditioning, which quantifies how predictions would change if future context were revealed. We provide theoretical justification linking self-containedness to predictive consistency, and extensive experiments validate VSB's efficacy over fixed-size and heuristic blockwise decoding.
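Abstract (reference translation): Discrete diffusion language models (dLLMs) allow parallel token updates with bidirectional attention, but practical generation usually adopts blockwise semi-autoregressive decoding. This switch produces a mismatch between training and inference: training denoises with full-sequence context, whereas inference commits tokens within a bounded block without future context. As a result, decoding with fixed-size or heuristic-based blocks can lead to premature token commitments, because decisions are made without full access to future context that could change those choices. We therefore propose self-containedness as a principled criterion for block commitment. A block is self-contained if its predictions remain consistent with (Future-Aware, FA) or without (No-Future, NF) access to future context, which treats block boundary selection as a test of self-containedness rather than a heuristic choice. Based on this principle, we propose Variable-size Self-contained Blocks (VSB) for dLLMs. VSB scores and selects block boundaries using the divergence between token-level predictive distributions under NF and FA conditioning. We provide theoretical justification linking self-containedness to predictive consistency, and conduct extensive experiments validating the efficacy of VSB over fixed-size and heuristic blockwise decoding.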
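
The abstract describes scoring block boundaries by the divergence between token-level predictive distributions under NF and FA conditioning. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not specify the divergence, the aggregation over tokens, or the selection rule, so the choices here are assumptions: KL(FA || NF) as the divergence, the worst-case token in the block as the block score, and a hypothetical threshold `tau` with a "largest block that passes" rule. The function names and the `nf_by_size` input layout are likewise illustrative.

```python
import math
from typing import Dict, List, Sequence

Dist = Sequence[float]  # one probability distribution over the vocabulary


def kl_divergence(p: Dist, q: Dist, eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)


def block_score(nf: List[Dist], fa: List[Dist]) -> float:
    """Largest token-level KL(FA || NF) inside one candidate block.

    Assumption: taking the maximum over tokens is an illustrative
    aggregation; the source does not state how tokens are combined.
    """
    return max(kl_divergence(f, n) for n, f in zip(nf, fa))


def select_block_size(nf_by_size: Dict[int, List[Dist]], fa: List[Dist], tau: float) -> int:
    """Pick the largest candidate block size whose score is at most tau.

    nf_by_size[s] holds the NF predictions for the first s positions when
    the block ends after position s (hypothetical input layout).
    Falls back to the smallest candidate if no size passes.
    """
    sizes = sorted(nf_by_size)
    passing = [s for s in sizes if block_score(nf_by_size[s], fa[:s]) <= tau]
    return max(passing) if passing else sizes[0]


# Toy example over a two-word vocabulary: the third token's prediction
# flips once future context is revealed, so a block of size 3 is rejected.
fa = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
nf_by_size = {
    1: [[0.9, 0.1]],
    2: [[0.9, 0.1], [0.8, 0.2]],
    3: [[0.9, 0.1], [0.8, 0.2], [0.9, 0.1]],
}
chosen = select_block_size(nf_by_size, fa, tau=0.05)
print(chosen)  # prints 2
```

In this toy setup the first two positions have identical NF and FA predictions (divergence 0), so they can be committed together, while the third position would change with future context and is left for a later block.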

Related papers