Fugu-MT Paper Translation (Abstract): Rényi Sharpness: A Novel Sharpness that Strongly Correlates with Generalization

Paper overview: Rényi Sharpness: A Novel Sharpness that Strongly Correlates with Generalization

arxiv url: http://arxiv.org/abs/2510.07758v1
Date: Thu, 09 Oct 2025 03:58:21 GMT
Status: Translation complete
System update date: 2025-10-10 17:54:14.855072
Title: Rényi Sharpness: A Novel Sharpness that Strongly Correlates with Generalization
Title (reference translation): Rényi Sharpness: A Novel Sharpness that Strongly Correlates with Generalization
Authors: Qiaozhe Zhang, Jun Sun, Ruijie Zhang, Yingzhuang Liu
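Abstract summary: We propose a novel sharpness measure, Rényi sharpness, defined as the negative Rényi entropy (a generalization of the classical Shannon entropy) of the loss Hessian. To rigorously establish the relationship between generalization and (Rényi) sharpness, we provide several generalization bounds in terms of Rényi sharpness. Experiments are conducted to verify the strong correlation (specifically, the Kendall rank correlation) between Rényi sharpness and generalization.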
Reference score (independently computed attention score): 7.429398847018864
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Sharpness (of the loss minima) is a common measure used to investigate the generalization of neural networks. Intuitively, the flatter the landscape near the minima, the better the generalization might be. Unfortunately, the correlation between many existing sharpness measures and generalization is usually not strong, and is sometimes even weak. To close the gap between intuition and reality, we propose a novel sharpness measure, Rényi sharpness, defined as the negative Rényi entropy (a generalization of the classical Shannon entropy) of the loss Hessian. The main ideas are as follows: 1) we observe that uniform (identical) eigenvalues of the loss Hessian are most desirable (while keeping the sum constant) for achieving good generalization; 2) we employ the Rényi entropy to concisely characterize the extent of the spread of the eigenvalues of the loss Hessian. Normally, the larger the spread, the smaller the (Rényi) entropy. To rigorously establish the relationship between generalization and (Rényi) sharpness, we provide several generalization bounds in terms of Rényi sharpness, taking advantage of the reparametrization invariance property of Rényi sharpness as well as the trick of translating the data discrepancy into a weight perturbation. Furthermore, extensive experiments are conducted to verify the strong correlation (specifically, the Kendall rank correlation) between Rényi sharpness and generalization. Moreover, we propose using a variant of Rényi sharpness as a regularizer during training, i.e., Rényi Sharpness Aware Minimization (RSAM), which turns out to outperform all existing sharpness-aware minimization methods. It is worth noting that the test accuracy gain of our proposed RSAM method can be as high as nearly 2.5% compared with the classical SAM method.
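Abstract (reference translation): The sharpness of loss minima is a common measure for studying the generalization of neural networks. Intuitively, the flatter the landscape near the minima, the better the generalization may be. Unfortunately, the correlation between many existing sharpness measures and generalization is usually not strong, and is sometimes weak. To close the gap between intuition and reality, we propose a novel sharpness measure, Rényi sharpness, defined as the negative Rényi entropy (a generalization of the classical Shannon entropy) of the loss Hessian. The main ideas are as follows: 1) we observe that uniform (identical) eigenvalues of the loss Hessian are most desirable for achieving good generalization (while keeping the sum constant); 2) we use the Rényi entropy to concisely characterize the extent of the spread of the eigenvalues of the loss Hessian. Normally, the larger the spread, the smaller the (Rényi) entropy. To rigorously establish the relationship between generalization and (Rényi) sharpness, we provide several generalization bounds in terms of Rényi sharpness by exploiting its reparametrization invariance. Furthermore, extensive experiments are conducted to verify the strong correlation (specifically, the Kendall rank correlation) between Rényi sharpness and generalization. Moreover, we propose using a variant of Rényi sharpness as a regularizer during training, i.e., Rényi Sharpness Aware Minimization (RSAM). It is worth noting that the accuracy gain of the proposed method is as high as nearly 2.5% compared with the classical SAM method.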
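
Illustrative sketch (not from the paper): the abstract defines Rényi sharpness as the negative Rényi entropy of the loss Hessian but does not give the exact formula. The Python below is one plausible reading under stated assumptions: the Hessian is positive semidefinite, its eigenvalues are normalized by their sum into a probability vector, and the order `alpha` is a free parameter. The function names `renyi_entropy` and `renyi_sharpness` and the example spectra are hypothetical and chosen only for illustration.

```python
import math


def renyi_entropy(probs, alpha=2.0):
    """Rényi entropy of order alpha (natural log) of a probability vector."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    probs = [p for p in probs if p > 0]  # zero entries contribute nothing
    if math.isclose(alpha, 1.0):
        # The limit alpha -> 1 is the Shannon entropy.
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def renyi_sharpness(eigenvalues, alpha=2.0):
    """Negative Rényi entropy of the sum-normalized spectrum.

    Assumption: normalizing by the eigenvalue sum is our reading of
    "while keeping the sum constant"; the paper may normalize differently.
    """
    if any(ev < 0 for ev in eigenvalues):
        raise ValueError("this sketch assumes a positive semidefinite Hessian")
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    probs = [ev / total for ev in eigenvalues]
    return -renyi_entropy(probs, alpha)


# Two hypothetical spectra with the same sum (4.0).
uniform = [1.0, 1.0, 1.0, 1.0]   # identical eigenvalues
spread = [3.7, 0.1, 0.1, 0.1]    # one dominant eigenvalue

print("uniform:", renyi_sharpness(uniform))
print("spread: ", renyi_sharpness(spread))
```

Under these assumptions the uniform spectrum attains the lowest possible sharpness for four eigenvalues, `-log(4)`, and the spread spectrum scores higher, which matches the abstract's statement that a larger spread gives a smaller entropy.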

Related papers