Beyond Ridge Regression for Distribution-Free Data
- URL: http://arxiv.org/abs/2206.08757v1
- Date: Fri, 17 Jun 2022 13:16:46 GMT
- Title: Beyond Ridge Regression for Distribution-Free Data
- Authors: Koby Bibas and Meir Feder
- Abstract summary: The predictive normalized maximum likelihood (pNML) has been proposed as the min-max regret solution for the distribution-free setting, where no distributional assumptions are made on the data.
It has been suggested to use NML with ``luckiness'': A prior-like function is applied to the hypothesis class, which reduces its effective size.
The associated pNML with luckiness (LpNML) prediction deviates from the ridge regression empirical risk minimizer (Ridge ERM).
Our LpNML reduces the Ridge ERM error by up to 20% for the PMLB sets, and is up to 4.9% more robust in the presence of distribution shift compared to recent leading methods for UCI sets.
- Score: 8.523307608620094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In supervised batch learning, the predictive normalized maximum likelihood
(pNML) has been proposed as the min-max regret solution for the
distribution-free setting, where no distributional assumptions are made on the
data. However, the pNML is not defined for a large-capacity hypothesis class such as
over-parameterized linear regression. For a large class, a common approach is
to use regularization or a model prior. In the context of online prediction
where the min-max solution is the Normalized Maximum Likelihood (NML), it has
been suggested to use NML with ``luckiness'': A prior-like function is applied
to the hypothesis class, which reduces its effective size. Motivated by the
luckiness concept, for linear regression we incorporate a luckiness function
that penalizes the hypothesis proportionally to its l2 norm. This leads to the
ridge regression solution. The associated pNML with luckiness (LpNML)
prediction deviates from the ridge regression empirical risk minimizer (Ridge
ERM): When the test data reside in the subspace corresponding to the small
eigenvalues of the empirical correlation matrix of the training data, the
prediction is shifted toward 0. Our LpNML reduces the Ridge ERM error by up to
20% for the PMLB sets, and is up to 4.9% more robust in the presence of
distribution shift compared to recent leading methods for UCI sets.
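To make the construction concrete, here is a minimal numerical sketch (assuming Gaussian noise with scale sigma; the label grid, variable names, and the exact genie/luckiness weighting are illustrative assumptions, not the paper's closed-form derivation). It fits the Ridge ERM induced by the l2-norm luckiness function, then computes a pNML-style predictive mean by refitting the genie on the training data plus each candidate test label and normalizing over labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set with anisotropic features, so the empirical
# correlation matrix X^T X has some genuinely small eigenvalues.
n, d, lam, sigma = 50, 10, 1.0, 1.0
X = rng.normal(size=(n, d)) * np.logspace(0.0, -3.0, d)
theta_true = rng.normal(size=d)
Y = X @ theta_true + sigma * rng.normal(size=n)

# Ridge ERM: minimizer of ||X theta - Y||^2 + lam * ||theta||^2,
# the solution induced by the l2-norm luckiness function.
A = X.T @ X + lam * np.eye(d)
theta_ridge = np.linalg.solve(A, X.T @ Y)

def lpnml_mean(x, y_grid):
    # For each candidate label y, refit the regularized genie on the
    # training data plus the test pair (x, y), score y by the genie's
    # likelihood weighted by the luckiness of the fitted hypothesis,
    # then normalize over the label grid (the pNML recipe). This is a
    # hedged sketch: the paper's exact conventions may differ.
    A_aug = A + np.outer(x, x)
    rhs0 = X.T @ Y
    log_q = np.empty_like(y_grid)
    for i, y in enumerate(y_grid):
        theta_y = np.linalg.solve(A_aug, rhs0 + x * y)   # genie fit
        resid = y - x @ theta_y                           # test residual
        log_q[i] = -(resid ** 2 + lam * theta_y @ theta_y) / (2 * sigma ** 2)
    q = np.exp(log_q - log_q.max())
    q /= q.sum()                                          # normalize over labels
    return float(y_grid @ q)

# A test point in the small-eigenvalue subspace of the empirical
# correlation matrix: the LpNML-style mean is typically pulled toward 0
# relative to the Ridge ERM prediction, mirroring the abstract.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
x_small = eigvecs[:, 0]                # smallest-eigenvalue direction
y_grid = np.linspace(-20.0, 20.0, 4001)
print("Ridge ERM prediction  :", x_small @ theta_ridge)
print("LpNML-style prediction:", lpnml_mean(x_small, y_grid))
```

The paper derives the LpNML predictor in closed form; the label grid above is only a brute-force stand-in for that normalization step.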
Related papers
- Quantifying the Prediction Uncertainty of Machine Learning Models for Individual Data [2.1248439796866228]
This study investigates the pNML as a learnability measure for linear regression and neural networks.
It demonstrates that pNML can improve the performance and robustness of these models on various tasks.
arXiv Detail & Related papers (2024-12-10T13:58:19Z)
- Revisiting Essential and Nonessential Settings of Evidential Deep Learning [70.82728812001807]
Evidential Deep Learning (EDL) is an emerging method for uncertainty estimation.
We propose Re-EDL, a simplified yet more effective variant of EDL.
arXiv Detail & Related papers (2024-10-01T04:27:07Z)
- Deep Limit Model-free Prediction in Regression [0.0]
We provide a model-free approach based on deep neural networks (DNNs) to accomplish point prediction and prediction intervals under a general regression setting.
Our method is more stable and accurate compared to other DNN-based counterparts, especially for optimal point predictions.
arXiv Detail & Related papers (2024-08-18T16:37:53Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
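As a rough, hypothetical illustration of the bias-constraint idea (the cited paper defines the actual BCE objective; the function name and weighting below are assumptions), one can penalize the empirical bias of a learned estimator across repeated noisy measurements of the same unknown, alongside the usual MSE:

```python
import numpy as np

def bias_constrained_loss(estimates, thetas, weight=1.0):
    """Hypothetical bias-constrained loss: MSE plus a squared-bias penalty.

    estimates: shape (num_params, num_draws), several estimates of each
               underlying parameter value from independent noisy measurements.
    thetas:    shape (num_params,), the true parameter values.
    Penalizing the per-parameter empirical bias pushes the estimator toward
    unbiasedness, so averaging independent estimates actually reduces error.
    """
    mse = np.mean((estimates - thetas[:, None]) ** 2)
    bias = estimates.mean(axis=1) - thetas   # empirical bias per parameter
    return mse + weight * np.mean(bias ** 2)
```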
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandits, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the Importance Guided Stochastic Gradient Descent (IGSGD) method to train inference models on inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Rapid Risk Minimization with Bayesian Models Through Deep Learning Approximation [9.93116974480156]
We introduce a novel combination of Bayesian Models (BMs) and Neural Networks (NNs) for making predictions with a minimum expected risk.
Our approach combines the data efficiency and interpretability of a BM with the speed of an NN.
We achieve risk minimized predictions significantly faster than standard methods with a negligible loss on the testing dataset.
arXiv Detail & Related papers (2021-03-29T15:08:25Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- The Predictive Normalized Maximum Likelihood for Over-parameterized Linear Regression with Norm Constraint: Regret and Double Descent [12.929639356256928]
We show that modern machine learning models do not obey a trade-off between the complexity of a prediction rule and its ability to generalize.
We use the recently proposed predictive normalized maximum likelihood (pNML) which is the min-max regret solution for individual data.
We demonstrate the use of the pNML regret as a point-wise learnability measure on synthetic data, and show that it successfully predicts the double-descent phenomenon.
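For a rough sense of the regret as a pointwise learnability measure (a hedged sketch: constants are dropped and the norm constraint studied in the cited paper is omitted), the linear-regression pNML regret grows with how far a test point leaves the well-sampled span of the training data:

```python
import numpy as np

def pnml_regret_proxy(X_train, x_test):
    # In the spirit of the pNML regret for linear regression:
    # log(1 + x^T (X^T X)^+ x). The pseudo-inverse covers the
    # over-parameterized (rank-deficient) case; large values flag
    # test points that the training data barely support.
    pinv_corr = np.linalg.pinv(X_train.T @ X_train)
    return float(np.log1p(x_test @ pinv_corr @ x_test))
```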
arXiv Detail & Related papers (2021-02-14T15:49:04Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Ridge Regression Revisited: Debiasing, Thresholding and Bootstrap [4.142720557665472]
Ridge regression may be worth another look since, after debiasing and thresholding, it may offer some advantages over the Lasso.
In this paper, we define a debiased and thresholded ridge regression method, and prove a consistency result and a Gaussian approximation theorem.
In addition to estimation, we consider the problem of prediction, and present a novel, hybrid bootstrap algorithm tailored for prediction intervals.
arXiv Detail & Related papers (2020-09-17T05:04:10Z)