Fugu-MT Paper Translation (Abstract): Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs

Paper overview: Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs

arxiv url: http://arxiv.org/abs/2510.03680v1
Date: Sat, 04 Oct 2025 05:24:27 GMT
Status: Translation complete
Last updated in system: 2025-10-07 16:52:59.190861
Title: Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs
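Title (reference translation): Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs
Authors: Bumjun Kim, Dongjae Jeon, Dueun Kim, Wonje Jeung, Albert No
Abstract summary: Instruction-tuned diffusion large language models exhibit a critical vulnerability called \texttt{<eos>} overflow. Rainbow Padding is a simple remedy that replaces repeated \texttt{<eos>} placeholders with a repeating cycle of distinct padding tokens. Experiments show that Rainbow Padding substantially improves length robustness and output quality, with as few as seven padding tokens sufficient to prevent early termination.
Reference score (independently computed attention score): 10.214443153276962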
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion large language models (dLLMs) have emerged as a promising alternative to autoregressive models, offering flexible generation orders and strong performance on complex reasoning tasks. However, instruction-tuned dLLMs exhibit a critical vulnerability we term \texttt{<eos>} overflow: as allocated sequence length increases, responses paradoxically become shorter, collapsing into early termination or degenerating into streams of \texttt{<eos>} tokens. Although noticed in practice, this issue has not been systematically analyzed. We trace its root cause to the dual role of \texttt{<eos>} as both termination and padding, which concentrates probability mass on \texttt{<eos>} at later positions and propagates backward to trigger early termination. To address this, we introduce Rainbow Padding, a simple remedy that replaces repeated \texttt{<eos>} placeholders with a repeating cycle of distinct padding tokens, distributing probability mass and breaking \texttt{<eos>} dominance. Experiments show that Rainbow Padding substantially improves length robustness and output quality, with as few as seven padding tokens sufficient to prevent early termination. Moreover, the method integrates efficiently into existing instruction-tuned models: LoRA fine-tuning for a single epoch on minimal data yields significant improvements, making this solution highly practical. The code is publicly available at https://github.com/quasar529/rainbow-padding.
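Abstract (reference translation): Diffusion large language models (dLLMs) have emerged as a promising alternative to autoregressive models, offering flexible generation orders and strong performance on complex reasoning tasks. However, instruction-tuned dLLMs exhibit a critical vulnerability called \texttt{<eos>} overflow: as the allocated sequence length increases, responses paradoxically become shorter, collapsing into early termination or degenerating into streams of \texttt{<eos>} tokens. Although this has been noticed in practice, the issue has not been systematically analyzed. The authors trace its root cause to the dual role of \texttt{<eos>} as both termination and padding token, which concentrates probability mass on \texttt{<eos>} at later positions and propagates backward to trigger early termination. To address this, they introduce Rainbow Padding, a simple remedy that replaces repeated \texttt{<eos>} placeholders with a repeating cycle of distinct padding tokens, distributing probability mass and breaking \texttt{<eos>} dominance. Experiments show that Rainbow Padding substantially improves length robustness and output quality, with as few as seven padding tokens sufficient to prevent early termination. Moreover, the method integrates efficiently into existing instruction-tuned models: LoRA fine-tuning for a single epoch on minimal data yields significant improvements, making the solution highly practical. The code is publicly available at https://github.com/quasar529/rainbow-padding.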
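The padding scheme described in the abstract can be illustrated with a minimal sketch. It contrasts conventional padding, where every unused position holds \texttt{<eos>}, with a cyclic fill of distinct padding tokens. The token names (`<pad_0>` ... `<pad_6>`), the function names, and the choice to keep a single \texttt{<eos>} as the terminator before the cycle are assumptions made for illustration; they are not taken from the authors' repository.

```python
from itertools import cycle, islice

EOS = "<eos>"
# Hypothetical token names. Seven distinct pads is the count the abstract
# reports as sufficient; the names themselves are an assumption.
RAINBOW_PADS = [f"<pad_{i}>" for i in range(7)]


def _check_fits(response, target_len):
    # Both schemes below need at least one free slot after the response.
    if len(response) >= target_len:
        raise ValueError("response must be shorter than target_len")


def eos_padding(response, target_len):
    """Baseline: <eos> doubles as terminator and padding."""
    _check_fits(response, target_len)
    return list(response) + [EOS] * (target_len - len(response))


def rainbow_padding(response, target_len, pads=RAINBOW_PADS):
    """Assumed variant: one <eos> terminator, then a repeating cycle of pads."""
    _check_fits(response, target_len)
    if not pads:
        raise ValueError("pads must contain at least one token")
    n_pad = target_len - len(response) - 1  # slots left after the single <eos>
    return list(response) + [EOS] + list(islice(cycle(pads), n_pad))


if __name__ == "__main__":
    tokens = ["The", "answer", "is", "42", "."]
    print(eos_padding(tokens, 16))
    print(rainbow_padding(tokens, 16))
```

With the baseline, a training target of length 16 for this five-token response contains eleven \texttt{<eos>} tokens; with the cyclic fill it contains one, and the remaining ten positions are spread over the seven distinct padding tokens.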

Related papers