Fugu-MT Paper Translation (Abstract): Double Descent and Overfitting under Noisy Inputs and Distribution Shift for Linear Denoisers

Paper summary: Double Descent and Overfitting under Noisy Inputs and Distribution Shift for Linear Denoisers

arxiv url: http://arxiv.org/abs/2305.17297v3
Date: Thu, 14 Mar 2024 23:02:53 GMT
Status: Translation complete
Last updated in system: 2024-03-19 07:42:00.921224
Title: Double Descent and Overfitting under Noisy Inputs and Distribution Shift for Linear Denoisers
Title (reference translation): Double Descent and Overfitting under Noisy Inputs and Distribution Shift for Linear Denoisers
Authors: Chinmaya Kausik, Kashvi Srivastava, Rishi Sonthalia
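Abstract summary: One concern about studying supervised denoising is that noiseless training data from the test distribution may not always be available. This work therefore studies supervised denoising and noisy-input regression under distribution shift.
Reference score (independently computed attention score): 3.481985817302898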
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the importance of denoising in modern machine learning and ample empirical work on supervised denoising, its theoretical understanding is still relatively scarce. One concern about studying supervised denoising is that one might not always have noiseless training data from the test distribution. It is more reasonable to have access to noiseless training data from a different dataset than the test dataset. Motivated by this, we study supervised denoising and noisy-input regression under distribution shift. We add three considerations to increase the applicability of our theoretical insights to real-life data and modern machine learning. First, while most past theoretical work assumes that the data covariance matrix is full-rank and well-conditioned, empirical studies have shown that real-life data is approximately low-rank. Thus, we assume that our data matrices are low-rank. Second, we drop independence assumptions on our data. Third, the rise in computational power and dimensionality of data has made it important to study non-classical regimes of learning. Thus, we work in the non-classical proportional regime, where data dimension $d$ and number of samples $N$ grow as $d/N = c + o(1)$. For this setting, we derive data-dependent, instance-specific expressions for the test error for both denoising and noisy-input regression, and study when overfitting the noise is benign, tempered or catastrophic. We show that the test error exhibits double descent under general distribution shift, providing insights for data augmentation and the role of noise as an implicit regularizer. We also perform experiments using real-life data, where we match the theoretical predictions with under 1% MSE error for low-rank data.
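Abstract (reference translation): Despite the importance of denoising in modern machine learning and the ample empirical work on supervised denoising, its theoretical understanding remains scarce. One concern in studying supervised denoising is that noiseless training data from the test distribution is not always available. It is more reasonable to have access to noiseless training data from a dataset different from the test dataset. This work therefore studies supervised denoising and noisy-input regression under distribution shift. Three considerations are added to make the theoretical insights more applicable to real-life data and modern machine learning. First, most past theoretical work assumes that the data covariance matrix is full-rank and well-conditioned, whereas empirical studies have shown that real-life data is approximately low-rank. The data matrices are therefore assumed to be low-rank. Second, independence assumptions on the data are dropped. Third, the growth in computational power and in data dimensionality has made it important to study non-classical learning regimes. The work therefore operates in the non-classical proportional regime, where the data dimension $d$ and the number of samples $N$ grow as $d/N = c + o(1)$. In this setting, the paper derives expressions for the test error and studies when overfitting the noise is benign, tempered, or catastrophic. The test error exhibits double descent under general distribution shift, which provides insights into data augmentation and the role of noise as an implicit regularizer. Experiments on real-life data match the theoretical predictions with under 1% MSE error for low-rank data.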
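
The setting in the abstract can be explored numerically with a small simulation. The sketch below is not the authors' code and does not reproduce their theoretical expressions. It fits a minimum-norm linear denoiser W = X (X + A)^+ on synthetic low-rank data and reports the test error for several ratios c = d/N. The function names, the rank of 5, the noise variance of 1/d per entry, and the absence of distribution shift (train and test data share one subspace) are all assumptions made for illustration.

```python
import numpy as np


def denoising_test_error(d, n_train, rank=5, n_test=200, seed=0):
    """Per-sample test error of the minimum-norm linear denoiser W = X (X + A)^+."""
    rng = np.random.default_rng(seed)
    # Low-rank clean data: columns lie in a random rank-`rank` subspace of R^d.
    basis = np.linalg.qr(rng.standard_normal((d, rank)))[0]
    x_train = basis @ rng.standard_normal((rank, n_train))
    x_test = basis @ rng.standard_normal((rank, n_test))
    # Gaussian input noise with variance 1/d per entry (an assumed scaling).
    a_train = rng.standard_normal((d, n_train)) / np.sqrt(d)
    a_test = rng.standard_normal((d, n_test)) / np.sqrt(d)
    # Minimum-norm least-squares solution of min_W ||X - W (X + A)||_F^2.
    w = x_train @ np.linalg.pinv(x_train + a_train)
    residual = x_test - w @ (x_test + a_test)
    return float(np.mean(np.sum(residual**2, axis=0)))


def sweep(d=60, ratios=(0.25, 0.5, 1.0, 2.0, 4.0), trials=10):
    """Average the test error over several seeds for each ratio c = d/N."""
    errors = {}
    for c in ratios:
        n_train = int(round(d / c))
        runs = [denoising_test_error(d, n_train, seed=s) for s in range(trials)]
        errors[c] = float(np.mean(runs))
    return errors


if __name__ == "__main__":
    for c, err in sweep().items():
        print(f"c = d/N = {c:4.2f}  test error = {err:.3f}")
```

In runs of this kind the error typically peaks near c = 1, where the noisy training matrix is square and poorly conditioned, and is lower on either side. That is the qualitative double-descent shape the abstract refers to. The exact values depend on the seeds and on the assumed scalings.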

Related papers