Fugu-MT Paper Translation (Abstract): The Disparate Impacts of Speculative Decoding

Paper abstract: The Disparate Impacts of Speculative Decoding

arxiv url: http://arxiv.org/abs/2510.02128v1
Date: Thu, 02 Oct 2025 15:38:57 GMT
Status: Translation complete
Last updated in system: 2025-10-03 16:59:21.190491
Title: The Disparate Impacts of Speculative Decoding
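Title (reference translation): The Differing Impacts of Speculative Decoding
Authors: Jameson Sandler, Ahmet Üstün, Marco Romanelli, Sara Hooker, Ferdinando Fioretto
Abstract summary: Speculative decoding is a technique for systematically reducing the decoding time of large language models. This paper shows that the speed-up gained from speculative decoding is not uniformly distributed across tasks and consistently diminishes for under-fit, often underrepresented tasks.
Reference score (independently computed attention level): 54.98795989404752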
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The practice of speculative decoding, whereby inference is probabilistically supported by a smaller, cheaper, "drafter" model, has become a standard technique for systematically reducing the decoding time of large language models. This paper conducts an analysis of speculative decoding through the lens of its potential disparate speed-up rates across tasks. Crucially, the paper shows that speed-up gained from speculative decoding is not uniformly distributed across tasks, consistently diminishing for under-fit, and often underrepresented tasks. To better understand this phenomenon, we derive an analysis to quantify this observed "unfairness" and draw attention to the factors that motivate such disparate speed-ups to emerge. Further, guided by these insights, the paper proposes a mitigation strategy designed to reduce speed-up disparities and validates the approach across several model pairs, revealing on average a 12% improvement in our fairness metric.
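Abstract (reference translation): In the practice of speculative decoding, inference is probabilistically supported by a smaller, cheaper "drafter" model, and this has become a standard technique for systematically reducing the decoding time of large language models. This paper analyzes speculative decoding through the lens of potentially disparate speed-up rates across tasks. Importantly, it shows that the speed-up gained from speculative decoding is not uniformly distributed across tasks and consistently diminishes for under-fit, often underrepresented tasks. To better understand this phenomenon, we derive an analysis to quantify the observed "unfairness" and draw attention to the factors that cause such disparate speed-ups to emerge. Furthermore, guided by these insights, we propose a mitigation strategy to reduce speed-up disparities and validate the approach across multiple model pairs.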
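
As a rough illustration of why speed-up can differ across tasks, the sketch below uses the commonly cited i.i.d. acceptance model for speculative decoding. It is not the analysis or the fairness metric derived in this paper. The function name, the task names, and all numeric values (acceptance rates, draft length, cost ratio) are hypothetical and chosen only for illustration. Under this model, a task on which the drafter's tokens are accepted less often receives a smaller expected speed-up.

```python
def expected_speedup(alpha: float, gamma: int, c: float) -> float:
    """Expected wall-clock speed-up under a simple i.i.d. acceptance model.

    alpha: probability that a single drafted token is accepted (assumed i.i.d.)
    gamma: number of tokens drafted per verification step
    c:     cost of one drafter step relative to one target-model step
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    # Expected tokens produced per target-model verification step.
    expected_tokens = (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)
    # Each step costs gamma drafter calls plus one target call.
    return expected_tokens / (gamma * c + 1.0)


# Hypothetical per-task acceptance rates (assumptions, not values from the paper).
tasks = {"well_fit_task": 0.9, "under_fit_task": 0.6}
speedups = {name: expected_speedup(a, gamma=4, c=0.05) for name, a in tasks.items()}

for name, s in speedups.items():
    print(f"{name}: {s:.2f}x")
```

With these assumed numbers the well-fit task gets roughly a 3.4x speed-up and the under-fit task roughly 1.9x, which is the kind of gap the abstract describes qualitatively.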

Related papers