Fugu-MT Paper Translation (Abstract): From Noise to Diversity: Random Embedding Injection in LLM Reasoning

Paper summary: From Noise to Diversity: Random Embedding Injection in LLM Reasoning

arxiv url: http://arxiv.org/abs/2605.11936v1
Date: Tue, 12 May 2026 10:47:20 GMT
Status: Translation complete
Last updated in system: 2026-05-13 21:48:56.805105
Title: From Noise to Diversity: Random Embedding Injection in LLM Reasoning
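Title (reference translation): From Noise to Diversity: Random Embedding Injection in LLM Reasoning
Authors: Heejun Kim, Seungpil Lee, Jewon Yeom, Jaewon Sok, Seonghyeon Park, Jeongjae Park, Taesup Kim, Sundong Kim
Abstract summary: The paper studies Random Soft Prompts (RSPs), which drop the training step entirely and append a sequence of random embedding vectors to the input. RSPs reach accuracy comparable to optimized soft prompts on math reasoning benchmarks in several settings. During inference, RSPs lift early-stage token diversity and, combined with temperature sampling, widen Pass@N, the probability that at least one out of N attempts is correct.
Reference score (independently computed attention metric): 10.961329691434685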
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent soft prompt research has tried to improve reasoning by inserting trained vectors into LLM inputs, yet whether the gain comes from the learned content or from the act of injection itself has not been carefully separated. We study Random Soft Prompts (RSPs), which drop the training step entirely and append a freshly drawn sequence of random embedding vectors to the input. Each RSP vector is sampled from an isotropic Gaussian fitted to the entrywise mean and variance of the pretrained embedding table; the sequence carries no learned content, and yet reaches accuracy comparable to optimized soft prompts on math reasoning benchmarks in several settings. The mechanism unfolds in two stages: because attention has to absorb a never-seen-before random position, the distribution over the first few generated tokens flattens and reasoning trajectories branch, and as generation continues this influence dilutes naturally so the response commits to a single completion. We show that during inference RSPs lift early-stage token diversity and, combined with temperature sampling, widen Pass@N, the probability that at least one out of N attempts is correct. Beyond inference, we carry the same effect into DAPO training and demonstrate practical gains. Our contributions are: (i) RSP isolates the simplest form of soft prompt -- training-free, freshly resampled -- providing a unified lens for the structural effect of injection that variants otherwise differing in training and form all share; (ii) a theoretical and empirical validation of the underlying mechanism; and (iii) an extension from inference to training.
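Abstract (reference translation): Recent soft prompt research has tried to improve reasoning by inserting trained vectors into LLM inputs, but whether the gain comes from the learned content or from the act of injection itself has not been carefully separated. We study Random Soft Prompts (RSPs), which drop the training step entirely and append a freshly drawn sequence of random embedding vectors to the input. Each RSP vector is sampled from an isotropic Gaussian fitted to the entrywise mean and variance of the pretrained embedding table; the sequence carries no learned content, yet it reaches accuracy comparable to optimized soft prompts on math reasoning benchmarks in several settings. The mechanism unfolds in two stages: because attention has to absorb a random position it has never seen before, the distribution over the first few generated tokens flattens and reasoning trajectories branch; as generation continues, this influence dilutes naturally and the response commits to a single completion. We show that during inference RSPs lift early-stage token diversity and, combined with temperature sampling, widen Pass@N, the probability that at least one out of N attempts is correct. Beyond inference, we carry the same effect into DAPO training and demonstrate practical gains. Our contributions are: (i) RSP isolates the simplest form of soft prompt -- training-free, freshly resampled -- providing a unified lens for the structural effect of injection that is shared by variants otherwise differing in training and form; (ii) a theoretical and empirical validation of the underlying mechanism; and (iii) an extension from inference to training.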
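
The two quantities named in the abstract, the RSP sampling rule and Pass@N, can be illustrated with a minimal standard-library Python sketch. This is not the authors' code. It assumes that "entrywise mean and variance" means a single scalar mean and a single scalar variance pooled over every entry of the embedding table, and that Pass@N is estimated empirically as the fraction of problems with at least one correct attempt. All function names (`sample_random_soft_prompt`, `append_soft_prompt`, `empirical_pass_at_n`) and the toy table are hypothetical.

```python
import random
import statistics


def sample_random_soft_prompt(embedding_table, num_vectors, rng):
    """Draw num_vectors vectors from an isotropic Gaussian.

    Assumption: the Gaussian uses one scalar mean and one scalar standard
    deviation, both computed over all entries of embedding_table.
    """
    entries = [value for row in embedding_table for value in row]
    mu = statistics.fmean(entries)       # pooled entrywise mean
    sigma = statistics.pstdev(entries)   # pooled entrywise standard deviation
    dim = len(embedding_table[0])
    return [[rng.gauss(mu, sigma) for _ in range(dim)]
            for _ in range(num_vectors)]


def append_soft_prompt(input_embeddings, soft_prompt):
    """Append the random vectors after the input token embeddings."""
    return input_embeddings + soft_prompt


def empirical_pass_at_n(correct):
    """Fraction of problems where at least one of the N attempts is correct.

    correct[i][j] is True when attempt j on problem i was correct.
    """
    return sum(any(row) for row in correct) / len(correct)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a pretrained embedding table: 100 tokens, 8 dimensions.
    table = [[rng.gauss(0.0, 0.02) for _ in range(8)] for _ in range(100)]
    prompt_tokens = table[:3]  # pretend the input is the first three tokens
    rsp = sample_random_soft_prompt(table, num_vectors=4, rng=rng)
    model_input = append_soft_prompt(prompt_tokens, rsp)
    print(len(model_input), len(model_input[0]))
    print(empirical_pass_at_n([[False, True], [False, False]]))
```

Because the abstract describes the vectors as freshly drawn, this sketch would call `sample_random_soft_prompt` again for each attempt rather than reuse one sample. How the vectors are fed to a real model is outside its scope.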

Related papers