Fugu-MT Paper Translation (Abstract): Introspective Diffusion Language Models

Paper summary: Introspective Diffusion Language Models

arxiv url: http://arxiv.org/abs/2604.11035v1
Date: Mon, 13 Apr 2026 06:01:01 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:16.35461
Title: Introspective Diffusion Language Models
Title (reference translation): Introspective Diffusion Language Models
Authors: Yifan Yu, Yuqing Jian, Junxiong Wang, Zhongzhu Zhou, Donglin Zhuang, Xinyu Fang, Sri Yanamandra, Xiaoxia Wu, Qingyang Wu, Shuaiwen Leon Song, Tri Dao, Ben Athiwaratkun, James Zou, Fan Lai, Chenfeng Xu
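Abstract summary: The Introspective Diffusion Language Model (I-DLM) is a paradigm that retains diffusion-style parallel decoding while inheriting the introspective consistency of AR training. I-DLM uses a novel introspective strided decoding (ISD) algorithm, which lets the model verify previously generated tokens while advancing new ones in the same forward pass. To the authors' knowledge, I-DLM is the first DLM to match the quality of its same-scale AR counterpart, and it outperforms prior DLMs in both model quality and practical serving efficiency across 15 benchmarks.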
Reference score (independently computed attention score): 58.91876345013321
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion language models promise parallel generation, yet still lag behind autoregressive (AR) models in quality. We trace this gap to a failure of introspective consistency: AR models agree with their own generations, while DLMs often do not. We define the introspective acceptance rate, which measures whether a model accepts its previously generated tokens. This reveals why AR training has a structural advantage: causal masking and logit shifting implicitly enforce introspective consistency. Motivated by this observation, we introduce the Introspective Diffusion Language Model (I-DLM), a paradigm that retains diffusion-style parallel decoding while inheriting the introspective consistency of AR training. I-DLM uses a novel introspective strided decoding (ISD) algorithm, which enables the model to verify previously generated tokens while advancing new ones in the same forward pass. From a systems standpoint, we build the I-DLM inference engine on AR-inherited optimizations and further customize it with a stationary-batch scheduler. To the best of our knowledge, I-DLM is the first DLM to match the quality of its same-scale AR counterpart while outperforming prior DLMs in both model quality and practical serving efficiency across 15 benchmarks. It reaches 69.6 on AIME-24 and 45.7 on LiveCodeBench-v6, exceeding LLaDA-2.1-mini (16B) by more than 26 and 15 points, respectively. Beyond quality, I-DLM is designed for the growing demand for large-concurrency serving, delivering about 3x higher throughput than prior state-of-the-art DLMs.
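Abstract (reference translation): Diffusion language models promise parallel generation, but they still trail autoregressive (AR) models in quality. The authors link this gap to a failure of introspective consistency: AR models agree with their own generations, whereas DLMs often do not. They define the introspective acceptance rate, which measures whether a model accepts its previously generated tokens. This shows why AR training has a structural advantage: causal masking and logit shifting implicitly enforce introspective consistency. The paper introduces the Introspective Diffusion Language Model (I-DLM), a paradigm that keeps diffusion-style parallel decoding while inheriting the introspective consistency of AR training. I-DLM uses a novel introspective strided decoding (ISD) algorithm, with which the model can verify previously generated tokens while advancing new ones in the same forward pass. On the systems side, the I-DLM inference engine is built on AR-inherited optimizations and further customized with a stationary-batch scheduler. To the authors' knowledge, I-DLM is the first DLM to match the quality of its same-scale AR counterpart, and it outperforms prior DLMs in both model quality and practical serving efficiency across 15 benchmarks. It reaches 69.6 on AIME-24 and 45.7 on LiveCodeBench-v6, exceeding LLaDA-2.1-mini (16B) by more than 26 and 15 points, respectively. Beyond quality, I-DLM is designed for the growing demand for large-concurrency serving and delivers about 3x the throughput of prior state-of-the-art DLMs.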
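
Illustrative note (not from the paper): the abstract describes the introspective acceptance rate only as a measure of whether a model accepts its previously generated tokens and does not give the formula. The Python sketch below shows one plausible reading, assuming that "accepts" means the model's greedy next-token prediction for each prefix equals the token it generated earlier. The function name, the `predict_next` callback, and the toy model are all hypothetical stand-ins, so the paper's actual definition may differ.

```python
from typing import Callable, Sequence


def introspective_acceptance_rate(
    tokens: Sequence[int],
    predict_next: Callable[[Sequence[int]], int],
    prompt_len: int = 0,
) -> float:
    """Fraction of generated tokens the model reproduces from their own prefix.

    Assumption: a token counts as "accepted" when the greedy prediction for
    its prefix matches it exactly. This is an illustrative criterion only.
    """
    positions = range(prompt_len, len(tokens))
    if len(positions) == 0:
        raise ValueError("no generated tokens to score")
    accepted = sum(
        1 for i in positions if predict_next(tokens[:i]) == tokens[i]
    )
    return accepted / len(positions)


def toy_predict_next(prefix: Sequence[int]) -> int:
    # Hypothetical stand-in for a model: predicts "last token + 1",
    # or 0 when the prefix is empty.
    return prefix[-1] + 1 if prefix else 0


# A sequence the toy model fully agrees with, and one with a deviation.
consistent = [0, 1, 2, 3]
inconsistent = [0, 1, 5, 6]

rate_consistent = introspective_acceptance_rate(consistent, toy_predict_next)
rate_inconsistent = introspective_acceptance_rate(inconsistent, toy_predict_next)
```

Under this reading, a model that always agrees with its own output scores 1.0, and each token it would not reproduce lowers the rate proportionally; in the toy example one of four tokens is rejected.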


Related papers