Fugu-MT Paper Translation (Summary): PRISM: A Geometric Risk Bound that Decomposes Drift into Scale, Shape, and Head

Paper summary: PRISM: A Geometric Risk Bound that Decomposes Drift into Scale, Shape, and Head

arxiv url: http://arxiv.org/abs/2605.11608v1
Date: Tue, 12 May 2026 06:40:34 GMT
Status: translation complete
Last updated in system: 2026-05-13 21:48:56.635109
Title: PRISM: A Geometric Risk Bound that Decomposes Drift into Scale, Shape, and Head
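Authors: Chieh-Yen Lin, Shao-Hua Sun
Abstract summary: This work proposes PRISM, which exploits the linear output head of LLMs and the near-isometric structure of their backbones. The bound is calibrated for variant ranking and decomposes drift into three independently measurable axes. PRISM ranks variants with mean Spearman correlations of 0.820 for post-training quantization and 0.831 for LoRA forgetting.
Reference score (attention score computed by this site): 14.880821907124451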
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Comparing post-training LLM variants, such as quantized, LoRA-adapted, and distilled models, requires a diagnostic that identifies how a variant has drifted, not only whether it has degraded. Existing similarity scores such as CKA and SVCCA can flag degradation, but they do not directly link representation drift to risk or mechanism. We propose PRISM, Proxy Risk Inference via Structural Mapping, which exploits the linear output head of LLMs and the empirically near-isometric structure of their backbones to derive a closed-form upper bound on the cross-entropy risk gap between a target model and a post-training variant. The bound is calibrated for variant ranking and decomposes drift into three independently measurable axes: scale mismatch, shape mismatch, and head divergence. Each axis corresponds to a distinct failure mode, including shape distortion under low-bit quantization, scale separability under LoRA forgetting, and head divergence under GGUF k-quantization. As a result, the dominant axis suggests a remediation direction rather than merely raising a degradation flag. Because the shape term is differentiable, the same geometry can also serve as a training-time regularizer against catastrophic forgetting. Across two model families and five benchmarks, PRISM ranks variants with mean Spearman correlations of 0.820 for post-training quantization and 0.831 for LoRA forgetting, and its axis-guided shape regularizer outperforms experience replay in aggregate at mitigating downstream forgetting.
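The abstract evaluates PRISM by the Spearman correlation between its ranking of variants and their observed degradation. The sketch below shows how such a rank correlation is computed in general. It is not the authors' evaluation code: the function names `average_ranks` and `spearman` and the five score pairs are hypothetical, chosen only to illustrate the metric.

```python
from __future__ import annotations

from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical numbers for five model variants (not taken from the paper):
# a proxy bound value per variant and the measured risk gap per variant.
proxy_bound = [0.12, 0.35, 0.20, 0.80, 0.55]
measured_gap = [0.02, 0.10, 0.15, 0.40, 0.30]

rho = spearman(proxy_bound, measured_gap)
print(f"Spearman correlation: {rho:.3f}")  # prints 0.900
```

Only the ordering of the variants matters here, not the magnitude of the scores, which is why a bound that is loose in absolute terms can still be useful for ranking.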

Related papers

Too Correct to Learn: Reinforcement Learning on Saturated Reasoning Data [55.84428098924793]
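We propose Constrained Uniform Top-K Sampling (CUTS), a parameter-free decoding method for structure-preserving exploration. We integrate it into Mixed-CUTS, a training framework that combines exploitative and exploratory rollouts to amplify intra-group advantage variance. Notably, Mixed-CUTS improves Pass@1 accuracy on the AIME25 benchmark by 15.1% over standard GRPO.
Paper reference translation (metadata) (2026-04-20T16:43:28Z)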
ATLAS: Constitution-Conditioned Latent Geometry and Redistribution Across Language Models and Neural Perturbation Data [0.0]
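Constitution-conditioned post-training can be analyzed as a structured perturbation of a model's learned representation geometry. We introduce ATLAS, a geometry-first program that traces constitution-induced hidden-state structure across graphs, models, and substrates.
Paper reference translation (metadata) (2026-04-19T23:26:02Z)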
Information-Geometric Decomposition of Generalization Error in Unsupervised Learning [0.0]
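We decompose the Kullback-Leibler generalization error (GE) of unsupervised learning into three non-negative components: model error, data bias, and variance. The decomposition is exact for any e-flat model class.
Paper reference translation (metadata) (2026-04-14T06:23:18Z)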
Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout [1.3902537392439644]
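We study distributed optimization over networks with Byzantine agents that send arbitrary adversarial messages. We propose Gradient Tracking with Probabilistic Edge Dropout and Leaky Integration (GT-PD-L). GT-PD-L outperforms the coordinate-wise mean by up to 4.3% under stealth attacks.
Paper reference translation (metadata) (2026-04-01T03:55:42Z)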
Le Cam Distortion: A Decision-Theoretic Framework for Robust Transfer Learning [0.0]
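We introduce Le Cam Distortion as a rigorous upper bound on transfer risk, based on simulability. By learning a kernel that simulates the target from the source, our framework enables transfer without degrading the source. Le Cam Distortion provides the first principled framework for risk-controlled transfer learning in domains where negative transfer is unacceptable.
Paper reference translation (metadata) (2025-12-29T17:21:44Z)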
Latent Sculpting for Zero-Shot Generalization: A Manifold Learning Approach to Out-of-Distribution Anomaly Detection [2.8547732086436306]
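A fundamental limitation of supervised deep learning is "generalization collapse." We propose Latent Sculpting, a hierarchical two-stage representation learning framework. We report a detection rate of 88.89% on "infiltration" scenarios.
Paper reference translation (metadata) (2025-12-19T11:37:02Z)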
Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
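Three simple ideas make it possible to train models with DRO using a broader class of parametric likelihood ratios. We find that models trained with parametric adversaries are consistently more robust to subpopulation shift than those trained with other DRO approaches.
Paper reference translation (metadata) (2022-04-13T12:43:12Z)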
Label Distributionally Robust Losses for Multi-class Classification: Consistency, Robustness and Adaptivity [55.29408396918968]
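We study a family of loss functions called label-distributionally robust (LDR) losses for multi-class classification. Our contributions cover both consistency and robustness, by establishing the top-$k$ consistency of LDR losses for multi-class classification. We propose an adaptive LDR loss that automatically adapts an individualized temperature parameter to the noise level of each instance's class label.
Paper reference translation (metadata) (2021-12-30T00:27:30Z)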
Shaping Deep Feature Space towards Gaussian Mixture for Visual Classification [74.48695037007306]
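We propose a Gaussian mixture (GM) loss function for deep neural networks for visual classification. With a classification margin and likelihood regularization, the GM loss promotes both high classification performance and accurate modeling of the feature distribution. The proposed model can be implemented easily and efficiently without additional trainable parameters.
Paper reference translation (metadata) (2020-11-18T03:32:27Z)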

The related papers list is generated automatically from the titles and abstracts of papers on this site.
