Strong inductive biases provably prevent harmless interpolation
- URL: http://arxiv.org/abs/2301.07605v1
- Date: Wed, 18 Jan 2023 15:37:11 GMT
- Title: Strong inductive biases provably prevent harmless interpolation
- Authors: Michael Aerni, Marco Milanta, Konstantin Donhauser, Fanny Yang
- Abstract summary: This paper argues that the degree to which interpolation is harmless hinges upon the strength of an estimator's inductive bias.
Our main theoretical result establishes tight non-asymptotic bounds for high-dimensional kernel regression.
- Score: 8.946655323517092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classical wisdom suggests that estimators should avoid fitting noise to
achieve good generalization. In contrast, modern overparameterized models can
yield small test error despite interpolating noise -- a phenomenon often called
"benign overfitting" or "harmless interpolation". This paper argues that the
degree to which interpolation is harmless hinges upon the strength of an
estimator's inductive bias, i.e., how heavily the estimator favors solutions
with a certain structure: while strong inductive biases prevent harmless
interpolation, weak inductive biases can even require fitting noise to
generalize well. Our main theoretical result establishes tight non-asymptotic
bounds for high-dimensional kernel regression that reflect this phenomenon for
convolutional kernels, where the filter size regulates the strength of the
inductive bias. We further provide empirical evidence of the same behavior for
deep neural networks with varying filter sizes and rotational invariance.
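To make the abstract's setup concrete, the sketch below shows (near-)interpolating kernel regression with a simple convolutional kernel whose filter size q sets the strength of the inductive bias: small q restricts the estimator to local patch structure, while q = d recovers an unstructured inner-product kernel. The base kernel (1 + <., .>)^2, the data model, and all constants here are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def conv_kernel(x, z, q):
    """Convolutional kernel: average a base kernel over all cyclic
    patches of size q. Small q = strong (local) inductive bias;
    q = d recovers a fully-connected inner-product kernel.
    Illustrative choice, not the paper's exact kernel."""
    d = x.shape[-1]
    total = 0.0
    for i in range(d):
        idx = [(i + j) % d for j in range(q)]      # cyclic patch of size q
        total += (1.0 + np.dot(x[idx], z[idx]) / q) ** 2
    return total / d

def kernel_matrix(X, Z, q):
    return np.array([[conv_kernel(x, z, q) for z in Z] for x in X])

rng = np.random.default_rng(0)
n, d = 40, 20
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)      # data on the sphere
y = X[:, 0] + 0.5 * rng.standard_normal(n)         # signal plus label noise

for q in (2, d):                                   # strong vs. weak inductive bias
    K = kernel_matrix(X, X, q)
    alpha = np.linalg.solve(K + 1e-8 * np.eye(n), y)  # ridgeless up to jitter
    print(f"filter size q={q}: train MSE = {np.mean((K @ alpha - y) ** 2):.2e}")
```

Both choices of q (near-)interpolate the noisy labels; the paper's theory characterizes when that interpolation is harmless for the test error, with the answer governed by the filter size.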
Related papers
- Towards Exact Computation of Inductive Bias [8.988109761916379]
We propose a novel method for efficiently computing the inductive bias required for generalization on a task.
We show that higher-dimensional tasks require greater inductive bias.
Our proposed inductive bias metric provides an information-theoretic interpretation of the benefits of specific model architectures.
arXiv Detail & Related papers (2024-06-22T21:14:24Z) - Benign overfitting in Fixed Dimension via Physics-Informed Learning with Smooth Inductive Bias [8.668428992331808]
We develop a Sobolev norm learning curve for kernel ridge(less) regression when addressing (elliptic) linear inverse problems.
Our results show that the PDE operators in the inverse problem can stabilize the variance and even lead to benign overfitting for fixed-dimensional problems.
arXiv Detail & Related papers (2024-06-13T14:54:30Z) - Generalization in Kernel Regression Under Realistic Assumptions [41.345620270267446]
We provide rigorous bounds for common kernels under any amount of regularization or noise, any input dimension, and any number of samples.
Our results imply benign overfitting in high input dimensions, nearly tempered overfitting in fixed dimensions, and explicit convergence rates for regularized regression.
As a by-product, we obtain time-dependent bounds for neural networks trained in the kernel regime.
arXiv Detail & Related papers (2023-12-26T10:55:20Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures [93.17009514112702]
Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence for this phenomenon, the relationship between neural network pruning and induced bias is not well-understood.
arXiv Detail & Related papers (2023-04-25T07:42:06Z) - Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z) - Fast rates for noisy interpolation require rethinking the effects of
inductive bias [8.946655323517092]
Good generalization performance on high-dimensional data hinges on a simple structure of the ground truth and a strong inductive bias of the estimator.
Our results suggest that, while a stronger inductive bias encourages a simpler structure that is more aligned with the ground truth, it also increases the detrimental effect of noise.
arXiv Detail & Related papers (2022-03-07T18:44:47Z) - Benign Overfitting in Adversarially Robust Linear Classification [91.42259226639837]
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z) - The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer
Linear Networks [51.1848572349154]
Neural network models that perfectly fit noisy data can generalize well to unseen test data.
We consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk.
arXiv Detail & Related papers (2021-08-25T22:01:01Z) - Interpolation can hurt robust generalization even when there is no noise [76.3492338989419]
We show that avoiding interpolation through ridge regularization can significantly improve robust generalization even in the absence of noise.
We prove this phenomenon for the robust risk of both linear regression and classification, and hence provide the first theoretical result on robust overfitting; a minimal sketch follows this list.
arXiv Detail & Related papers (2021-08-05T23:04:15Z)