Fugu-MT Paper Translation (Abstract): Learned Relay Representations for Forward-Thinking Discrete Diffusion Models

Paper abstract: Learned Relay Representations for Forward-Thinking Discrete Diffusion Models

arxiv url: http://arxiv.org/abs/2605.22967v1
Date: Thu, 21 May 2026 18:53:22 GMT
Status: translation complete
Last updated in system: 2026-05-25 17:29:20.063529
Title: Learned Relay Representations for Forward-Thinking Discrete Diffusion Models
Authors: Benjamin Rozonoyer, Jacopo Minniti, Dhruvesh Patel, Neil Band, Avishek Joey Bose, Tim G. J. Rudner, Andrew McCallum
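Abstract summary: To avoid a hard reset between denoising rounds, this paper proposes Learned Relay Representations (Relay). Relay introduces a differentiable per-token channel that passes information between forward passes and is trained via truncated backpropagation through time (BPTT). The authors show that Relay scales to state-of-the-art Diffusion Language Models (DLMs) and is seamlessly compatible with techniques such as block diffusion and KV caching.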
Reference score (attention score computed by this site): 34.17541648016911
License: http://creativecommons.org/licenses/by/4.0/
Abstract: When Masked Diffusion Models (MDMs) generate sequences through iterative refinement, the rich internal computation over masked positions is discarded, forcing every subsequent refinement step to recompute the valuable internal information stored as model representations. To avoid a hard reset between denoising rounds, we propose Learned Relay Representations (Relay), a method that allows MDMs to be forward-thinking when denoising by explicitly learning how to propagate latent information for the benefit of future denoising steps. Relay introduces a differentiable per-token channel that passes information between forward passes and is trained via truncated backpropagation through time (BPTT). We show that this framework can be scaled to state-of-the-art Diffusion Language Models (DLMs), and is seamlessly compatible with techniques like block diffusion and KV caching. We first provide a thorough justification of the design choices in Relay on a challenging Sudoku-based planning task. We then scale Relay to Fast-dLLM v2, a state-of-the-art DLM, outperforming standard supervised finetuning on coding tasks while reducing inference latency by up to 32%. Our empirical results demonstrate that state-of-the-art DLMs can be explicitly trained to relay latent information forward across decoding steps, advancing the performance-latency Pareto frontier. We provide code for all our experiments.
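The data flow described in the abstract, a per-token state that is handed from one forward pass to the next instead of being discarded, can be sketched with a toy decoding loop. This is a minimal illustration under loud assumptions, not the authors' method: `toy_forward`, `decode`, `MASK`, the integer scores, and the commit rule are all hypothetical, and training via truncated BPTT (which would need an autodiff framework) is not shown.

```python
from typing import List, Tuple

MASK = -1  # hypothetical id marking a still-masked position


def toy_forward(tokens: List[int], relay: List[int]) -> Tuple[List[int], List[int], List[int]]:
    """Hypothetical stand-in for one denoiser forward pass (not the paper's model).

    Returns per-position predictions, scores, and a new per-token relay state.
    """
    preds, scores = [], []
    for i, (tok, r) in enumerate(zip(tokens, relay)):
        evidence = 1 + (i % 3)  # arbitrary per-position signal for this pass
        scores.append(evidence + r)  # relayed state adds to the fresh signal
        preds.append(tok if tok != MASK else i % 10)  # toy prediction
    new_relay = list(scores)  # per-token state to hand to the next pass
    return preds, scores, new_relay


def decode(length: int, use_relay: bool, threshold: int = 4) -> Tuple[List[int], int]:
    """Iteratively unmask a sequence; return the tokens and the number of rounds."""
    tokens = [MASK] * length
    relay = [0] * length
    rounds = 0
    while MASK in tokens:
        rounds += 1
        preds, scores, new_relay = toy_forward(tokens, relay)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Always commit the best masked position, plus any that clear the threshold.
        best = max(masked, key=lambda i: scores[i])
        for i in masked:
            if i == best or scores[i] >= threshold:
                tokens[i] = preds[i]
        # Hard reset discards the state; the relay variant carries it forward.
        relay = new_relay if use_relay else [0] * length
    return tokens, rounds


if __name__ == "__main__":
    print(decode(6, use_relay=True))
    print(decode(6, use_relay=False))
```

With these arbitrary toy numbers, `decode(6, use_relay=True)` finishes in 4 rounds and `decode(6, use_relay=False)` in 6, both yielding `[0, 1, 2, 3, 4, 5]`. The round counts are an artifact of the toy scoring rule and say nothing about the paper's reported results; the sketch only shows where a relayed state would enter and leave each pass.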

Related papers