Fugu-MT Paper Translation (Abstract): Before and After Temperature: A Distributional View of Creative LLM Generation

Paper summary: Before and After Temperature: A Distributional View of Creative LLM Generation

arxiv url: http://arxiv.org/abs/2606.01451v1
Date: Sun, 31 May 2026 21:13:47 GMT
Status: translation complete
Last updated in system: 2026-06-02 21:34:29.710347
Title: Before and After Temperature: A Distributional View of Creative LLM Generation
Title (reference translation): Before and After Temperature: A Distributional View of Creative LLM Generation
Authors: V. S. Raghu Parupudi, Harsha Ponnada, Aditi Kaushal, S. Shria Parupudi, Saiteja Dasari, Sahiti Bulusu
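Abstract summary: We show that a much stronger signal lives one step earlier in the pipeline. On Llama-3.1-8B-Instruct generations of 500 open-ended creative prompts at $T \in \{0.3, 0.8, 1.5\}$, a single per-token feature predicts the within-prompt creativity rank. Mechanistically, the win comes from a sharp distributional signature of the incoherence regime.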
Reference score (independently computed attention score): 0.6927055673104934
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reference-free evaluation of large language model (LLM) creativity relies on perplexity, entropy, and top-1 margin. We show that a much stronger signal lives one step earlier in the pipeline: in how sampling temperature reshapes the model's token distribution before the next token is drawn. On Llama-3.1-8B-Instruct generations of 500 open-ended creative prompts at $T \in \{0.3, 0.8, 1.5\}$, a single per-token feature derived from this reshaping predicts the within-prompt creativity rank at Spearman $\rho = 0.918$ against an averaged gpt-4o / gemini-2.5-pro judge ($n = 500$) and $\rho = 0.870$ against a three-rater human-majority ranking ($n = 150$). Each of four standard reference-free baselines (self-perplexity, mean predictive entropy, top-1 margin, gzip compression ratio) tops out at $|\rho| \approx 0.76$ on both ground truths: a gap of $+0.165$ on averaged-LLM and $+0.110$ on human-majority, both far larger than the spread among the baselines themselves. The two ground-truth panels agree with each other at $\rho = 0.83$, above the inter-human ceiling of $\rho = 0.77$, so the comparison is not bottlenecked by judge noise. Mechanistically, the win comes from a sharp distributional signature of the incoherence regime: at $T = 1.5$ the cumulative-mass width $n_{95}(q)$ inflates from $\sim 1$ to $\sim 131$ tokens and post-temperature mass leaks off the pre-temperature top-$90\%$ plausible set by about $13$ percentage points. The per-token aggregates do not separate $T = 0.8$ from $T = 0.3$; discriminating the two coherent regimes is left to sequence-level features.
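The two distributional quantities named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the pre-temperature distribution `p` is taken to be the softmax at $T = 1$, `q` is the softmax of the same logits divided by $T$, `mass_width` counts the fewest top-ranked tokens needed to reach a given cumulative mass (one reading of $n_{95}$), and `leaked_mass` is the share of `q` that falls outside the top-$90\%$ set of `p`. The function names and the toy logits are hypothetical.

```python
import numpy as np


def apply_temperature(logits, temperature):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z = z - z.max()
    probs = np.exp(z)
    return probs / probs.sum()


def mass_width(dist, mass=0.95):
    """Fewest top-ranked tokens whose cumulative probability reaches `mass`."""
    cum = np.cumsum(np.sort(dist)[::-1])
    # searchsorted gives the first index with cum >= mass; add 1 to get a count.
    return int(min(np.searchsorted(cum, mass) + 1, len(cum)))


def leaked_mass(p, q, plausible=0.90):
    """Mass of q lying outside the smallest top set of p that holds `plausible` mass."""
    k = mass_width(p, plausible)
    plausible_set = np.argsort(p)[::-1][:k]
    return float(1.0 - q[plausible_set].sum())


# Toy logits over an 8-token vocabulary (assumed values, not from the paper).
logits = np.array([4.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0])
p = apply_temperature(logits, 1.0)  # assumed "pre-temperature" distribution

for T in (0.3, 0.8, 1.5):
    q = apply_temperature(logits, T)
    print(f"T={T}: n95={mass_width(q)}, leaked={leaked_mass(p, q):.3f}")
```

On a real model these quantities would be computed at every generated position from the full-vocabulary logits and then aggregated over the sequence; how the paper aggregates them is not stated in the abstract.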
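Abstract (reference translation): Reference-free evaluation of large language model (LLM) creativity relies on perplexity, entropy, and top-1 margin. We show that a much stronger signal lies in how sampling temperature reshapes the model's token distribution before the next token is drawn. On Llama-3.1-8B-Instruct generations of 500 open-ended creative prompts at $T \in \{0.3, 0.8, 1.5\}$, a single per-token feature derived from this reshaping predicts the within-prompt creativity rank at Spearman $\rho = 0.918$ against an averaged gpt-4o / gemini-2.5-pro judge ($n = 500$) and $\rho = 0.870$ against a three-rater human-majority ranking ($n = 150$). Each of four standard reference-free baselines (self-perplexity, mean predictive entropy, top-1 margin, gzip compression ratio) tops out at $|\rho| \approx 0.76$; the gaps of $+0.165$ on averaged-LLM and $+0.110$ on human-majority are both far larger than the spread among the baselines themselves. The two ground-truth panels agree with each other at $\rho = 0.83$, above the inter-human ceiling of $\rho = 0.77$. Mechanistically, the win comes from a sharp distributional signature of the incoherence regime: at $T = 1.5$ the cumulative-mass width $n_{95}(q)$ inflates from $\sim 1$ to $\sim 131$ tokens, and post-temperature mass leaks off the pre-temperature top-$90\%$ plausible set by about $13$ percentage points. The per-token aggregates do not separate $T = 0.8$ from $T = 0.3$.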
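For orientation, the four reference-free baselines listed in the abstract can be sketched as below. These are common textbook definitions chosen as assumptions for illustration; the abstract does not specify the exact formulas the authors used. Here `token_logprobs` are the natural-log probabilities the model assigned to its own sampled tokens, `dists` is a positions-by-vocabulary array of next-token probabilities, and all function names are hypothetical.

```python
import gzip
import math

import numpy as np


def self_perplexity(token_logprobs):
    """exp of the negative mean log-probability of the generated tokens."""
    return math.exp(-float(np.mean(token_logprobs)))


def mean_predictive_entropy(dists):
    """Shannon entropy (nats) of each next-token distribution, averaged over positions."""
    p = np.asarray(dists, dtype=np.float64)
    safe = np.where(p > 0, p, 1.0)  # log(1) = 0, so zero-probability entries add nothing
    return float((-(p * np.log(safe)).sum(axis=-1)).mean())


def top1_margin(dists):
    """Gap between the largest and second-largest probability, averaged over positions."""
    p = np.asarray(dists, dtype=np.float64)
    top2 = np.sort(p, axis=-1)[:, -2:]
    return float((top2[:, 1] - top2[:, 0]).mean())


def gzip_ratio(text):
    """Compressed size divided by raw size for non-empty text; lower means more repetitive."""
    raw = text.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)


# Toy inputs (assumed values, for illustration only).
dists = np.array([[0.7, 0.2, 0.1], [0.5, 0.5, 0.0]])
print(self_perplexity([math.log(0.5)] * 4))
print(mean_predictive_entropy(dists))
print(top1_margin(dists))
print(gzip_ratio("the cat sat on the mat " * 20))
```

Each function returns one scalar per generation, which is the shape needed to rank the three temperature settings within a prompt and compare that ranking with a judge's.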

Related papers

Trade-off Functions for DP-SGD with Subsampling based on Random Shuffling: Tight Upper and Lower Bounds [7.787109481104569]
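We derive a tight analysis of trade-off functions for DP-SGD with subsampling based on random shuffling. Our concrete bounds, derived via the Berry-Esseen theorem, are tight up to constant factors within the proof framework.
Paper reference translation (metadata) (2026-05-07T13:35:43Z)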
Bond-dimension scaling of a local-refinement advantage over hyperoptimized tensor-network contraction on Sycamore like topologies [0.0]
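We identify a missing local-refinement step in the cotengra tensor-network contraction pipeline. We show that its effect grows monotonically with bond dimension on the orthogonality graphs of Sycamore-type topologies.
Paper reference translation (metadata) (2026-04-28T11:59:31Z)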
Rényi exponent landscape of multipartite entanglement in free-fermion systems [51.56484100374058]
We show that the Rényi tripartite information $I_3$ exhibits qualitatively exclusion-dependent scaling at small Fermi momentum. $I_m^{(n)}/I_m^{(1)} \sim z^{m-1} \to 0$ for all integer $n \geq 2$, so the leading von Neumann signal can be built from integer Rényi data.
Paper reference translation (metadata) (2026-03-09T22:27:00Z)
Numerical Fragility in Transformers: A Layer-wise Theory for Explaining, Forecasting, and Mitigating Instability [0.0]
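We present a first-order, module-wise theory that predicts when and where errors arise. For self-attention, we derive a layer-wise bound that decomposes into three interpretable diagnostics. We also introduce a precision- and width-aware LayerNorm indicator $\rho_{\rm LN}$.
Paper reference translation (metadata) (2025-10-17T01:03:02Z)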
A Black-Box Debiasing Framework for Conditional Sampling [18.132736654624058]
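Conditional sampling is a fundamental task in Bayesian statistics and generative modeling. We propose a black-box debiasing scheme that improves the accuracy of such naive plug-in approaches.
Paper reference translation (metadata) (2025-10-13T07:11:27Z)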
On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm [52.95596504632859]
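This paper establishes the convergence rate $\frac{1}{K}\sum_{k=1}^{K} E\left[\|\nabla f(x^k)\|_1\right] \leq O(\frac{\sqrt{d}C}{K^{1/4}})$ for AdamW measured by the $\ell_1$ norm. The result extends to NAdamW, an AdamW variant that uses a double-momentum mechanism, and we show that it maintains the same convergence rate.
Paper reference translation (metadata) (2025-05-17T05:02:52Z)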
On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [54.28350823319057]
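This paper considers RMSProp and its momentum extension and establishes a convergence rate for $\frac{1}{T}\sum_{k=1}^{T}$ […]. Our convergence rate matches the lower bound with respect to all coefficients except the dimension $d$. The rate can be considered analogous to the $\frac{1}{T}\sum_{k=1}^{T}$ […] rate.
Paper reference translation (metadata) (2024-02-01T07:21:32Z)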
Nearly Horizon-Free Offline Reinforcement Learning [97.36751930393245]
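We revisit offline reinforcement learning on episodic time-homogeneous Markov decision processes with $S$ states, $A$ actions, and planning horizon $H$. We obtain the first set of nearly $H$-free sample complexity bounds for evaluation and planning using the empirical MDP.
Paper reference translation (metadata) (2021-03-25T18:52:17Z)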
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
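We show that extending the smoothing technique to other attack models is challenging. We present experimental results on CIFAR to validate our theory.
Paper reference translation (metadata) (2020-02-08T22:02:14Z)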

The related papers list is generated automatically from the titles and abstracts of papers on this site.
