Fugu-MT Paper Translation (Abstract): Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding

Paper overview: Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding

arxiv url: http://arxiv.org/abs/2605.14005v1
Date: Wed, 13 May 2026 18:11:42 GMT
Status: Translation complete
System update date: 2026-05-15 21:45:34.453973
Title: Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding
Authors: Shuoyang Sun, Chang Da, Hao Fang, Kuofeng Gao, Xinhao Zhong, Yi Sun, Fan Mo, Shu-Tao Xia, Bin Chen
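Abstract summary: Speculative decoding is widely adopted as a technique for accelerating large language model (LLM) inference. The authors identify a new mechanism-level vulnerability in model-based speculative decoding. The proposed Mistletoe is a stealthy acceleration-collapse attack against speculative decoding.
Reference score (independently computed attention measure): 47.771248673145614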
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Speculative decoding has become a widely adopted technique for accelerating large language model (LLM) inference by drafting multiple candidate tokens and verifying them with a target model in parallel. Its efficiency, however, critically depends on the average accepted length $τ$, i.e., how many draft tokens survive each verification step. In this work, we identify a new mechanism-level vulnerability in model-based speculative decoding: the drafter is trained to approximate the target model distribution, but this approximation is inevitably imperfect. Such a drafter-target mismatch creates a hidden attack surface where small perturbations can preserve the target model's visible behavior while substantially reducing draft-token acceptability. We propose Mistletoe, a stealthy acceleration-collapse attack against speculative decoding. Mistletoe directly targets the acceptance mechanism of speculative decoding. It jointly optimizes a degradation objective that decreases drafter-target agreement and a semantic-preservation objective that constrains the target model's output distribution. To resolve the conflict between these objectives, we introduce a null-space projection mechanism, where degradation gradients are projected away from the local semantic-preserving direction, suppressing draft acceptance while minimizing semantic drift. Experiments on various speculative decoding systems show that Mistletoe substantially reduces average accepted length $τ$, collapses speedup, and lowers averaged token throughput, while preserving output quality and perplexity. Our work highlights that speculative decoding introduces a mechanism-level attack surface beyond existing output robustness, calling for more robust designs of LLM acceleration systems.
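The link between drafter-target mismatch and accepted length can be illustrated with a minimal sketch. It assumes the standard speculative-sampling rule, in which a draft token x is accepted with probability min(1, p(x)/q(x)) for target distribution p and drafter distribution q, so the per-token acceptance rate is the sum over x of min(p(x), q(x)). It also assumes acceptances are independent across the drafted positions and counts only accepted draft tokens; whether the paper's $τ$ additionally counts the bonus token emitted by the target model is not stated in the abstract. The function names and toy distributions are hypothetical and are not taken from the paper.

```python
def acceptance_rate(p, q):
    """Per-token acceptance probability under the standard
    speculative-sampling rule: sum_x min(p(x), q(x)).

    p: target-model distribution over the vocabulary (list of floats)
    q: drafter distribution over the same vocabulary
    """
    return sum(min(pi, qi) for pi, qi in zip(p, q))


def expected_accepted_length(alpha, gamma):
    """Expected number of consecutively accepted draft tokens out of
    gamma drafted tokens, assuming each is accepted independently with
    probability alpha: sum_{k=1..gamma} alpha**k.
    """
    return sum(alpha ** k for k in range(1, gamma + 1))


# Toy three-token vocabulary (hypothetical numbers).
target = [0.6, 0.3, 0.1]
aligned_drafter = [0.55, 0.35, 0.10]     # close to the target
mismatched_drafter = [0.10, 0.30, 0.60]  # far from the target

for name, drafter in [("aligned", aligned_drafter),
                      ("mismatched", mismatched_drafter)]:
    alpha = acceptance_rate(target, drafter)
    length = expected_accepted_length(alpha, gamma=4)
    print(f"{name}: acceptance rate {alpha:.3f}, "
          f"expected accepted length {length:.3f}")
```

Under these assumptions the acceptance rate equals one minus the total variation distance between p and q, which is why a perturbation that widens the drafter-target gap shortens the accepted run even if the target's own output is unchanged.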
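The null-space projection mentioned in the abstract can be sketched in its simplest form. The abstract does not give the exact formulation, so this is one reading under stated assumptions: the semantic-preservation objective contributes a single local gradient direction g_sem, and the degradation gradient g_deg has its component along g_sem removed, leaving an update that is orthogonal to g_sem to first order. The names `g_deg`, `g_sem`, and `project_out` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_out(g_deg, g_sem, eps=1e-12):
    """Remove from g_deg its component along g_sem.

    Returns g_deg - (<g_deg, g_sem> / <g_sem, g_sem>) * g_sem, i.e. the
    projection of g_deg onto the orthogonal complement of g_sem.
    If g_sem is numerically zero, g_deg is returned unchanged.
    """
    denom = dot(g_sem, g_sem)
    if denom < eps:
        return list(g_deg)
    coef = dot(g_deg, g_sem) / denom
    return [gd - coef * gs for gd, gs in zip(g_deg, g_sem)]


# Toy gradients (hypothetical numbers).
g_deg = [1.0, 2.0, 3.0]
g_sem = [1.0, 1.0, 0.0]

g_proj = project_out(g_deg, g_sem)
print("projected gradient:", g_proj)
print("inner product with g_sem:", dot(g_proj, g_sem))
```

A single-direction projection is the smallest case of the idea; a method that preserves several semantic constraints at once would presumably project onto the null space of a matrix of such directions, which this sketch does not attempt.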

Related papers