Universality of Benign Overfitting in Binary Linear Classification
- URL: http://arxiv.org/abs/2501.10538v1
- Date: Fri, 17 Jan 2025 20:22:06 GMT
- Title: Universality of Benign Overfitting in Binary Linear Classification
- Authors: Ichiro Hashimoto, Stanislav Volgushev, Piotr Zwiernik
- Abstract summary: Deep neural networks seem to generalize well in the over-parametrized regime. It is now known that benign overfitting also occurs in various classical statistical models.
- Score: 1.6385815610837167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The practical success of deep learning has led to the discovery of several surprising phenomena. One of these phenomena, which has spurred intense theoretical research, is "benign overfitting": deep neural networks seem to generalize well in the over-parametrized regime even though the networks show a perfect fit to noisy training data. It is now known that benign overfitting also occurs in various classical statistical models. For linear maximum margin classifiers, benign overfitting has been established theoretically in a class of mixture models with very strong assumptions on the covariate distribution. However, even in this simple setting, many questions remain open. For instance, most of the existing literature focuses on the noiseless case where all true class labels are observed without errors, whereas the more interesting noisy case remains poorly understood. We provide a comprehensive study of benign overfitting for linear maximum margin classifiers. We discover a previously unknown phase transition in test error bounds for the noisy model and provide some geometric intuition behind it. We further considerably relax the required covariate assumptions in both the noisy and the noiseless case. Our results demonstrate that benign overfitting of maximum margin classifiers holds in a much wider range of scenarios than was previously known and provide new insights into the underlying mechanisms.
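As a quick numerical illustration of the phenomenon the abstract describes (this is not code from the paper; the dimensions, signal strength, and flip probability below are arbitrary illustrative choices), a max-margin linear classifier can interpolate noisy labels in the over-parameterized regime and still classify clean test points well:

```python
# Minimal sketch of benign overfitting for a max-margin linear classifier
# on a two-component Gaussian mixture with constant-probability label noise.
# All constants (n, d, ||mu||, flip_prob) are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d, flip_prob = 100, 1000, 0.1           # n << d: over-parameterized regime

mu = np.zeros(d)
mu[0] = 5.0                                 # mean direction of the mixture

def sample(m):
    y = rng.choice([-1, 1], size=m)         # balanced two-component mixture
    X = y[:, None] * mu + rng.standard_normal((m, d))
    return X, y

X_train, y_train = sample(n)
flips = rng.random(n) < flip_prob           # constant-probability label flips
y_noisy = np.where(flips, -y_train, y_train)

# SVC with a huge C approximates the hard-margin (max-margin) classifier;
# with d > n the noisy training data are linearly separable almost surely.
clf = SVC(kernel="linear", C=1e6).fit(X_train, y_noisy)

X_test, y_test = sample(5000)
print("train accuracy on noisy labels:", clf.score(X_train, y_noisy))  # 1.0
print("test accuracy on clean labels: ", clf.score(X_test, y_test))    # near 1
```

The classifier fits the flipped labels exactly, yet its test accuracy against the clean labels stays close to the Bayes optimum, which is the behavior the paper analyzes.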
Related papers
- Deep Learning is Not So Mysterious or Different [54.5330466151362]
We argue that anomalous generalization behaviour is not unique to neural networks.
We present soft inductive biases as a key unifying principle in explaining these phenomena.
We also highlight how deep learning is relatively distinct in other ways, such as its ability for representation learning.
arXiv Detail & Related papers (2025-03-03T22:56:04Z) - Gentle robustness implies Generalization [1.2630732866686982]
We present a class of novel bounds, which are model-dependent and provably tighter than existing robustness-based ones.
Unlike prior bounds, ours are guaranteed to converge to the true error of the best classifier as the number of samples increases.
We further provide extensive experiments and find that two of our bounds are often non-vacuous for a large class of deep neural networks pretrained on ImageNet.
arXiv Detail & Related papers (2024-12-09T10:59:39Z) - On The Relationship between Visual Anomaly-free and Anomalous Representations [0.0]
Anomaly Detection is an important problem within computer vision, with a variety of real-life applications.
In this paper, we hypothesize, and show by exhaustive experimentation, that the space of anomaly-free visual patterns of the normal samples correlates well with each of the various spaces of anomalous patterns of the class-specific anomaly samples.
arXiv Detail & Related papers (2024-10-09T06:18:53Z) - Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels [87.48541631675889]
We propose a two-stage clean samples identification method.
First, we employ a class-level feature clustering procedure for the early identification of clean samples.
Second, for the remaining clean samples that lie close to the ground-truth class boundary, we propose a novel consistency-based classification method.
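The summary gives only the outline of the two stages; the following is one schematic reading of it (the feature source, thresholds, and the k-NN consistency check are our illustrative stand-ins, not the paper's actual procedure):

```python
# Schematic two-stage clean-sample identification in the spirit of the
# summary above. Thresholds and the k-NN consistency check are assumptions.
import numpy as np

def identify_clean(features, labels, dist_quantile=0.5, k=10, agree=0.8):
    n = len(labels)
    classes = np.unique(labels)
    centroids = {c: features[labels == c].mean(axis=0) for c in classes}
    dist = np.array([np.linalg.norm(features[i] - centroids[labels[i]])
                     for i in range(n)])

    # Stage 1: class-level feature clustering -- samples close to their
    # labeled class centroid are accepted as clean early.
    clean = dist <= np.quantile(dist, dist_quantile)

    # Stage 2: for remaining samples near the class boundary, a consistency
    # check (here: label agreement with the k nearest stage-1 samples, a
    # stand-in for the paper's consistency-based classifier).
    clean_idx = np.flatnonzero(clean)       # stage-1 set, fixed for simplicity
    for i in np.flatnonzero(~clean):
        d = np.linalg.norm(features[clean_idx] - features[i], axis=1)
        neighbors = clean_idx[np.argsort(d)[:k]]
        if np.mean(labels[neighbors] == labels[i]) >= agree:
            clean[i] = True
    return clean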
arXiv Detail & Related papers (2022-07-29T04:54:57Z) - Prototype-Anchored Learning for Learning with Imperfect Annotations [83.7763875464011]
It is challenging to learn unbiased classification models from imperfectly annotated datasets.
We propose a prototype-anchored learning (PAL) method, which can be easily incorporated into various learning-based classification schemes.
We verify the effectiveness of PAL on class-imbalanced learning and noise-tolerant learning by extensive experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-06-23T10:25:37Z) - Benign Overfitting in Adversarially Robust Linear Classification [91.42259226639837]
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
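For a linear classifier the adversarial training objective has a convenient closed form (the worst-case l2 perturbation reduces the margin by eps times the weight norm), which makes this claim easy to probe numerically. A hedged sketch, where the step size, epsilon, and iteration count are our illustrative choices:

```python
# Adversarial training of a linear classifier under l2 perturbations of
# size eps: the adversarial margin is y<w,x> - eps*||w||_2, so we can run
# subgradient descent on the adversarial hinge loss directly.
import numpy as np

def adv_train_linear(X, y, eps=0.1, lr=0.01, steps=2000):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        wnorm = np.linalg.norm(w) + 1e-12
        margins = y * (X @ w) - eps * wnorm   # worst-case (adversarial) margins
        active = margins < 1.0                # samples with nonzero hinge loss
        # subgradient of mean_i max(0, 1 - margin_i)
        grad = -(y[active, None] * X[active]).sum(axis=0) / n \
               + eps * active.sum() / n * (w / wnorm)
        w -= lr * grad
    return w
```

Running this on a noisy over-parameterized mixture (as in the earlier sketch) and checking that training margins interpolate while clean test accuracy stays high is one concrete way to observe the benign overfitting the summary refers to.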
arXiv Detail & Related papers (2021-12-31T00:27:31Z) - Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures [100.55816326422773]
We study benign overfitting of the maximum margin classifier for linear classification problems.
Our results precisely characterize the condition under which benign overfitting can occur.
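For orientation, the data model in this line of work is the standard sub-Gaussian mixture; a reconstruction in our own notation (not copied from the paper):

```latex
% Standard sub-Gaussian mixture model behind these risk bounds
% (our notation, not the paper's):
y \sim \mathrm{Unif}\{-1,+1\}, \qquad x = y\,\mu + \xi,
% where mu in R^d is the mean direction and xi is a mean-zero
% sub-Gaussian noise vector. The maximum margin (hard-margin SVM)
% classifier solves
\hat{w} = \operatorname*{arg\,min}_{w \in \mathbb{R}^d} \|w\|_2
\quad \text{s.t.} \quad y_i \langle w, x_i \rangle \ge 1, \quad i = 1,\dots,n.
```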
arXiv Detail & Related papers (2021-04-28T08:25:16Z) - Entropy-Based Uncertainty Calibration for Generalized Zero-Shot Learning [49.04790688256481]
The goal of generalized zero-shot learning (GZSL) is to recognise both seen and unseen classes.
Most GZSL methods typically learn to synthesise visual representations from semantic information on the unseen classes.
We propose a novel framework that leverages dual variational autoencoders with a triplet loss to learn discriminative latent features.
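The triplet loss mentioned above has a standard form; a minimal numpy sketch (the margin value and the squared-distance variant are our assumptions, not details from the paper):

```python
# Triplet loss on latent features: pull same-class pairs together and push
# different-class pairs apart by at least a margin.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)   # same-class pair
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)   # different-class pair
    return np.maximum(0.0, d_pos - d_neg + margin).mean()
```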
arXiv Detail & Related papers (2021-01-09T05:21:27Z) - Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting and Regularization [39.35822033674126]
We study binary linear classification under a generative Gaussian mixture model.
We derive novel non-asymptotic bounds on the classification error of the max-margin classifier.
Our results extend to a noisy model with constant probability noise flips.
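Written out, the constant-probability label-flip model is (our notation; the paper may parameterize it differently):

```latex
% Constant-probability label noise: eta is the flip probability; training
% uses the noisy labels, test error is measured against the clean ones.
\tilde{y}_i =
\begin{cases}
  \;\;\, y_i & \text{with probability } 1 - \eta, \\
  -y_i & \text{with probability } \eta,
\end{cases}
\qquad \eta \in [0, 1/2).
```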
arXiv Detail & Related papers (2020-11-18T07:59:55Z) - Provable tradeoffs in adversarially robust classification [96.48180210364893]
We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry.
Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.
arXiv Detail & Related papers (2020-06-09T09:58:19Z)