Fugu-MT Paper Translation (Summary): $p$-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding

Paper summary: $p$-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding

arxiv url: http://arxiv.org/abs/2509.23234v2
Date: Tue, 30 Sep 2025 21:36:20 GMT
Status: Translation complete
System update date: 2025-10-02 12:11:26.777077
Title: $p$-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding
Authors: Runyan Tan, Shuang Wu, Phillip Howard,
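Abstract summary: $p$-less sampling is an information-theoretic approach to sampling that dynamically sets a truncation threshold at each decoding step. The authors show that $p$-less consistently outperforms existing sampling methods while exhibiting much less degradation in text quality at higher temperatures.
Reference score (independently computed attention measure): 10.595336643423229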
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Obtaining high-quality outputs from Large Language Models (LLMs) often depends upon the choice of a sampling-based decoding strategy to probabilistically choose the next token at each generation step. While a variety of such sampling methods have been proposed, their performance can be sensitive to the selection of hyperparameters which may require different settings depending upon the generation task and temperature configuration. In this work, we introduce $p$-less sampling: an information-theoretic approach to sampling which dynamically sets a truncation threshold at each decoding step based on the entire token probability distribution. Unlike existing methods, $p$-less sampling has no hyperparameters and consistently produces high-quality outputs as temperature increases. We provide theoretical perspectives on $p$-less sampling to ground our proposed method and conduct experiments to empirically validate its effectiveness across a range of math, logical reasoning, and creative writing tasks. Our results demonstrate how $p$-less sampling consistently outperforms existing sampling approaches while exhibiting much less degradation in text quality at higher temperature values. We further show how $p$-less achieves greater inference-time efficiency than alternative methods through lower average token sampling times and shorter generation lengths, without sacrificing accuracy. Finally, we provide analyses to highlight the benefits of $p$-less through qualitative examples, case studies, and diversity assessments.
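The abstract describes a truncation threshold computed from the entire token distribution at each decoding step, but this summary does not give the formula. The following is a minimal sketch only, assuming the threshold is the collision probability $\sum_i p_i^2$ (a quantity related to Rényi entropy of order 2). That choice, and the function names `softmax`, `truncate_by_collision_probability`, and `sample_token`, are illustrative assumptions and not confirmed by this page.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities at the given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def truncate_by_collision_probability(probs):
    """Zero out tokens below a distribution-dependent threshold, then renormalize.

    ASSUMPTION: the threshold sum(p_i^2) is a hypothetical stand-in for the
    paper's rule; it uses the whole distribution and needs no hyperparameter.
    """
    # In exact arithmetic sum(p_i^2) <= max(p), so the top token always survives;
    # the min() guards against floating-point rounding breaking that guarantee.
    threshold = min(sum(p * p for p in probs), max(probs))
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept], threshold


def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token index from the truncated, renormalized distribution."""
    probs = softmax(logits, temperature)
    truncated, _ = truncate_by_collision_probability(probs)
    return rng.choices(range(len(truncated)), weights=truncated, k=1)[0]
```

Under this assumed rule, a peaked distribution keeps only its dominant tokens, while a perfectly uniform distribution keeps every token, so the cutoff adapts per step without a tuned `p` or `k`.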

Related papers