Fugu-MT paper translation (abstract): Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models

Paper abstract: Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models

arxiv url: http://arxiv.org/abs/2010.13933v4
Date: Thu, 24 Feb 2022 16:38:45 GMT
Status: Translation complete
Last updated in system: 2022-10-02 18:40:13.104270
Title: Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models
Title (reference translation): Memorizing without overfitting: bias, variance, and interpolation in over-parameterized models
Authors: Jason W. Rocks and Pankaj Mehta
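Abstract summary: The bias-variance trade-off is a central concept in supervised learning. Modern Deep Learning methods flout this dogma to achieve state-of-the-art performance.
Reference score (independently computed attention score): 0.0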
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The bias-variance trade-off is a central concept in supervised learning. In classical statistics, increasing the complexity of a model (e.g., number of parameters) reduces bias but also increases variance. Until recently, it was commonly believed that optimal performance is achieved at intermediate model complexities which strike a balance between bias and variance. Modern Deep Learning methods flout this dogma, achieving state-of-the-art performance using "over-parameterized models" where the number of fit parameters is large enough to perfectly fit the training data. As a result, understanding bias and variance in over-parameterized models has emerged as a fundamental problem in machine learning. Here, we use methods from statistical physics to derive analytic expressions for bias and variance in two minimal models of over-parameterization (linear regression and two-layer neural networks with nonlinear data distributions), allowing us to disentangle properties stemming from the model architecture and random sampling of data. In both models, increasing the number of fit parameters leads to a phase transition where the training error goes to zero and the test error diverges as a result of the variance (while the bias remains finite). Beyond this threshold, the test error of the two-layer neural network decreases due to a monotonic decrease in \emph{both} the bias and variance in contrast with the classical bias-variance trade-off. We also show that in contrast with classical intuition, over-parameterized models can overfit even in the absence of noise and exhibit bias even if the student and teacher models match. We synthesize these results to construct a holistic understanding of generalization error and the bias-variance trade-off in over-parameterized models and relate our results to random matrix theory.
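Abstract (reference translation): The bias-variance trade-off is a central concept in supervised learning. In classical statistics, increasing a model's complexity (e.g., the number of parameters) reduces bias but also increases variance. Until recently, it was believed that optimal performance is achieved at intermediate model complexities that balance bias and variance. Modern Deep Learning methods achieve state-of-the-art performance using "over-parameterized models" that have enough fit parameters to fit the training data perfectly. As a result, understanding bias and variance in over-parameterized models has emerged as a fundamental problem in machine learning. Here, methods from statistical physics are used to derive analytic expressions for bias and variance in two minimal models of over-parameterization (linear regression and two-layer neural networks with nonlinear data distributions), making it possible to disentangle properties that stem from the model architecture from those that stem from random sampling of the data. In both models, increasing the number of fit parameters leads to a phase transition at which the training error goes to zero and the test error diverges because of the variance (while the bias remains finite). Beyond this threshold, the test error of the two-layer neural network decreases because both the bias and the variance decrease monotonically, in contrast with the classical bias-variance trade-off. The paper also shows that, contrary to classical intuition, over-parameterized models can overfit even in the absence of noise and can exhibit bias even when the student and teacher models match. These results are synthesized into a holistic understanding of generalization error and the bias-variance trade-off in over-parameterized models, and are related to random matrix theory.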
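The abstract describes a threshold at which the training error reaches zero while the variance, and hence the test error, blows up. The sketch below is a minimal numerical illustration of that behaviour. It is not the paper's analytic calculation. All of its settings are assumptions made for illustration: a linear teacher with `d` Gaussian features, a student that sees only the first `p` of them, minimum-norm least squares via `np.linalg.pinv`, and the function name `bias_variance`. Bias and variance are estimated empirically by refitting on many independently drawn training sets.

```python
import numpy as np


def bias_variance(p, n=40, d=200, sigma=0.5, trials=50, m=500, seed=0):
    """Estimate squared bias, variance, and training MSE by Monte Carlo.

    Hypothetical setup (not taken from the paper): the teacher is
    y = x . beta + noise with d Gaussian features; the student fits only
    the first p features with minimum-norm least squares on n samples.
    """
    rng = np.random.default_rng(seed)
    beta = rng.normal(size=d) / np.sqrt(d)   # teacher weights
    X_test = rng.normal(size=(m, d))         # fixed test inputs
    f_test = X_test @ beta                   # noiseless test targets

    preds = np.empty((trials, m))
    train_mse = 0.0
    for t in range(trials):
        # Draw a fresh training set for every trial.
        X = rng.normal(size=(n, d))
        y = X @ beta + sigma * rng.normal(size=n)
        # The pseudoinverse gives ordinary least squares for p < n and
        # the minimum-norm interpolating solution for p >= n.
        w = np.linalg.pinv(X[:, :p]) @ y
        preds[t] = X_test[:, :p] @ w
        train_mse += np.mean((X[:, :p] @ w - y) ** 2) / trials

    # Squared bias: error of the trial-averaged prediction.
    bias2 = np.mean((preds.mean(axis=0) - f_test) ** 2)
    # Variance: spread of predictions across training sets.
    variance = np.mean(preds.var(axis=0))
    return bias2, variance, train_mse


if __name__ == "__main__":
    for p in (10, 40, 160):  # under-, critically, and over-parameterized
        b2, var, tr = bias_variance(p)
        print(f"p={p:4d}  bias^2={b2:.3f}  variance={var:.3f}  train_mse={tr:.2e}")
```

With these assumed settings, the variance should be largest near `p = n`, and the training error should be essentially zero for `p > n`. The exact numbers depend on the seed and on the number of trials.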

Related papers

Revisiting Optimism and Model Complexity in the Wake of Overparameterized Machine Learning [6.278498348219108]
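We first revisit model complexity from first principles by reinterpreting and extending the classical statistical notion of (effective) degrees of freedom. We demonstrate the utility of the proposed complexity measures through a mix of conceptual arguments, theory, and experiments.
Paper reference translation (metadata) (2024-10-02T06:09:57Z)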
Aliasing and Label-Independent Decomposition of Risk: Beyond the bias-variance trade-off [0.0]
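A central problem in data science is to use potentially noisy samples to predict function values at unseen inputs. We introduce an alternative paradigm called the generalized aliasing decomposition (GAD). The GAD can be computed explicitly from the relationship between the model class and the samples, without seeing any data labels.
Paper reference translation (metadata) (2024-08-15T17:49:24Z)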
Scaling and renormalization in high-dimensional regression [72.59731158970894]
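This paper presents a concise derivation of the training and generalization performance of a variety of high-dimensional ridge regression models. It introduces and reviews recent results on these topics for readers with backgrounds in physics and deep learning.
Paper reference translation (metadata) (2024-05-01T15:59:00Z)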
On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
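Generalization captures a model's ability to classify unseen data. Invariance measures the consistency of model predictions under transformations of the data. From a dataset-centric view, a given model's accuracy and invariance are linearly correlated across different test sets.
Paper reference translation (metadata) (2022-07-14T17:08:25Z)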
Bias-variance decomposition of overparameterized regression with random linear features [0.0]
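"Over-parameterized models" avoid overfitting even when the number of fit parameters is large enough to fit the training data perfectly. We show how each transition arises from small nonzero eigenvalues of the Hessian matrix. We compare and contrast the phase diagram of the random linear features model with those of the random nonlinear features model and ordinary regression.
Paper reference translation (metadata) (2022-03-10T16:09:21Z)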
Optimization Variance: Exploring Generalization Properties of DNNs [83.78477167211315]
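The test error of deep neural networks (DNNs) often exhibits double descent. We propose a new measure, optimization variance (OV), to quantify the diversity of model updates.
Paper reference translation (metadata) (2021-06-03T09:34:17Z)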
Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
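We analyze a corpus of models made publicly available for a contest on predicting the generalization accuracy of neural network (NN) models. Metrics perform well overall but poorly on subpartitions of the data. We propose two new data-independent shape metrics and a data-dependent metric that can predict trends in test accuracy across a series of NNs.
Paper reference translation (metadata) (2021-06-01T19:19:49Z)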
Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition [34.235007566913396]
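We describe an interpretable, symmetric decomposition of the variance into terms associated with the labels. The bias decreases monotonically with network width, while the variance terms exhibit non-monotonic behavior. We also analyze the strikingly rich phenomenology that arises.
Paper reference translation (metadata) (2020-11-04T21:04:02Z)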
What causes the test error? Going beyond bias-variance via ANOVA [21.359033212191218]
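Modern machine learning methods are often over-parameterized, which lets them adapt to the data at a fine level. Recent work aims to understand more deeply why over-parameterization helps generalization. We propose using the analysis of variance (ANOVA) to decompose the variance in the test error symmetrically.
Paper reference translation (metadata) (2020-10-11T05:21:13Z)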
Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
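We develop a method to compute precisely the full distribution of test errors among interpolating classifiers. Test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model. Our results suggest that the usual analytic techniques of statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
Paper reference translation (metadata) (2020-06-22T21:12:31Z)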
An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
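We identify two key properties of the training data that drive this behavior. We show how a model's inductive bias toward "memorizing" can cause over-parameterization to hurt.
Paper reference translation (metadata) (2020-05-09T01:59:13Z)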

The related-papers list is generated automatically from the titles and abstracts of papers on this site.
