Regularized Linear Regression for Binary Classification
- URL: http://arxiv.org/abs/2311.02270v1
- Date: Fri, 3 Nov 2023 23:18:21 GMT
- Title: Regularized Linear Regression for Binary Classification
- Authors: Danil Akhtiamov, Reza Ghane and Babak Hassibi
- Abstract summary: Regularized linear regression is a promising approach for binary classification problems in which the training set has noisy labels.
We show that for large enough regularization strength, the optimal weights concentrate around two values of opposite sign.
We observe that in many cases the corresponding "compression" of each weight to a single bit leads to very little loss in performance.
- Score: 20.710343135282116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Regularized linear regression is a promising approach for binary
classification problems in which the training set has noisy labels since the
regularization term can help to avoid interpolating the mislabeled data points.
In this paper we provide a systematic study of the effects of the
regularization strength on the performance of linear classifiers that are
trained to solve binary classification problems by minimizing a regularized
least-squares objective. We consider the over-parametrized regime and assume
that the classes are generated from a Gaussian Mixture Model (GMM) where a
fraction $c<\frac{1}{2}$ of the training data is mislabeled. Under these
assumptions, we rigorously analyze the classification errors resulting from the
application of ridge, $\ell_1$, and $\ell_\infty$ regression. In particular, we
demonstrate that ridge regression invariably improves the classification error.
We prove that $\ell_1$ regularization induces sparsity and observe that in many
cases one can sparsify the solution by up to two orders of magnitude without
any considerable loss of performance, even though the GMM has no underlying
sparsity structure. For $\ell_\infty$ regularization we show that, for large
enough regularization strength, the optimal weights concentrate around two
values of opposite sign. We observe that in many cases the corresponding
"compression" of each weight to a single bit leads to very little loss in
performance. These latter observations can have significant practical
ramifications.
Related papers
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression [57.17120203327993]
The threshold-to-pseudo-label process (T2L) in classification uses confidence to determine label quality.
Regression likewise requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z)
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Out-Of-Domain Unlabeled Data Improves Generalization [0.7589678255312519]
We propose a novel framework for incorporating unlabeled data into semi-supervised classification problems.
We show that unlabeled samples can be harnessed to narrow the generalization gap.
We validate our claims through experiments conducted on a variety of synthetic and real-world datasets.
arXiv Detail & Related papers (2023-09-29T02:00:03Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- The Implicit Bias of Benign Overfitting [31.714928102950584]
Benign overfitting occurs when a predictor perfectly fits noisy training data while attaining near-optimal expected loss.
We show how this can be extended beyond standard linear regression.
We then turn to classification problems, and show that the situation there is much more favorable.
arXiv Detail & Related papers (2022-01-27T12:49:21Z)
- Benign Overfitting in Adversarially Robust Linear Classification [91.42259226639837]
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z)
- Robust Neural Network Classification via Double Regularization [2.41710192205034]
We propose a novel double regularization of the neural network training loss that combines a penalty on the complexity of the classification model and an optimal reweighting of training observations.
We demonstrate DRFit on neural network classification of (i) MNIST and (ii) CIFAR-10, in both cases with simulated mislabeling.
arXiv Detail & Related papers (2021-12-15T13:19:20Z)
- Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z)
- Label-Imbalanced and Group-Sensitive Classification under Overparameterization [32.923780772605596]
Label-imbalanced and group-sensitive classification seeks to appropriately modify standard training algorithms to optimize relevant metrics.
We show that a logit-adjusted loss modification to standard empirical risk minimization might be ineffective in general.
We show that our results extend naturally to binary classification with sensitive groups, thus treating the two common types of imbalances (label/group) in a unifying way.
arXiv Detail & Related papers (2021-03-02T08:09:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.