Fugu-MT paper translation (abstract): Planned Diffusion

Paper abstract: Planned Diffusion

arxiv url: http://arxiv.org/abs/2510.18087v1
Date: Mon, 20 Oct 2025 20:27:48 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:12.562055
Title: Planned Diffusion
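Title (reference translation): 計画拡散 (Planned Diffusion)
Authors: Daniel Israel, Tian Jin, Ellie Cheng, Guy Van den Broeck, Aditya Grover, Suvinay Subramanian, Michael Carbin
Abstract summary: A central challenge in large language model inference is the trade-off between generation speed and output quality. The authors propose planned diffusion, a hybrid method that combines the strengths of the autoregressive and diffusion paradigms. First, the model creates a short autoregressive plan that breaks the output into smaller, independent spans.
Reference score (attention score computed independently by the site): 57.74615417331808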
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A central challenge in large language model inference is the trade-off between generation speed and output quality. Autoregressive models produce high-quality text but generate tokens sequentially. Diffusion models can generate tokens in parallel but often need many iterations to match the same quality. We propose planned diffusion, a hybrid method that combines the strengths of both paradigms. Planned diffusion works in two stages: first, the model creates a short autoregressive plan that breaks the output into smaller, independent spans. Second, the model generates these spans simultaneously using diffusion. This approach expands the speed-quality Pareto frontier and provides a practical path to faster, high-quality text generation. On AlpacaEval, a suite of 805 instruction-following prompts, planned diffusion achieves Pareto-optimal trade-off between quality and latency, achieving 1.27x to 1.81x speedup over autoregressive generation with only 0.87% to 5.4% drop in win rate, respectively. Our sensitivity analysis shows that the planning mechanism of planned diffusion is minimal and reliable, and simple runtime knobs exist to provide flexible control of the quality-latency trade-off.
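Abstract (reference translation): A central challenge in large language model inference is the trade-off between generation speed and output quality. Autoregressive models generate high-quality text but produce tokens one after another. Diffusion models can produce tokens in parallel but often require many iterations to reach the same quality. The paper proposes planned diffusion, a hybrid method that combines the strengths of both paradigms. First, the model writes a short autoregressive plan that divides the output into small, independent spans. Second, the model generates these spans simultaneously using diffusion. This approach extends the speed-quality Pareto frontier and offers a practical path to fast, high-quality text generation. On AlpacaEval, which consists of 805 instruction-following prompts, planned diffusion achieves a Pareto-optimal trade-off between quality and latency, with speedups of 1.27x to 1.81x over autoregressive generation at a cost of a 0.87% to 5.4% drop in win rate, respectively. The sensitivity analysis shows that the planning mechanism of planned diffusion is minimal and reliable, and that simple runtime knobs exist for flexible control of the quality-latency trade-off.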
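
The two-stage idea in the abstract (a short sequential plan, then parallel generation of independent spans) can be illustrated with a toy count of forward passes. This is only a sketch under stated assumptions, not the authors' implementation: the `Span` type, the function names, and the parameters `plan_tokens_per_span` and `tokens_per_denoise_step` are all hypothetical, and counting passes is not the same as measuring latency, so the numbers it prints are not comparable to the 1.27x to 1.81x speedups reported above.

```python
import math
from dataclasses import dataclass


@dataclass
class Span:
    """One independent chunk of the output, as named by the plan (hypothetical type)."""
    label: str
    length: int  # number of tokens the span should contain


def autoregressive_steps(spans):
    # Sequential decoding: one forward pass per output token.
    return sum(s.length for s in spans)


def planned_diffusion_steps(spans, plan_tokens_per_span=4, tokens_per_denoise_step=2):
    # Both keyword parameters are illustrative assumptions, not values from the paper.
    if not spans:
        return 0
    # Stage 1: the plan is written autoregressively, one token per pass.
    plan_steps = plan_tokens_per_span * len(spans)
    # Stage 2: all spans are denoised together, so the longest span
    # determines how many denoising passes are needed.
    longest = max(s.length for s in spans)
    denoise_steps = math.ceil(longest / tokens_per_denoise_step)
    return plan_steps + denoise_steps


spans = [Span("intro", 30), Span("pros", 40), Span("cons", 40), Span("summary", 20)]
ar = autoregressive_steps(spans)
pd = planned_diffusion_steps(spans)
print(f"autoregressive passes: {ar}, planned-diffusion passes: {pd}")
```

In this toy model, lowering `tokens_per_denoise_step` adds denoising passes, which loosely mirrors the kind of quality-latency knob the abstract mentions; the paper's actual knobs may differ.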

Related papers