Interpolating Predictors in High-Dimensional Factor Regression
- URL: http://arxiv.org/abs/2002.02525v3
- Date: Sat, 20 Mar 2021 22:48:52 GMT
- Title: Interpolating Predictors in High-Dimensional Factor Regression
- Authors: Florentina Bunea, Seth Strimas-Mackey, Marten Wegkamp
- Abstract summary: This work studies finite-sample properties of the risk of the minimum-norm interpolating predictor in high-dimensional regression models.
We show that the min-norm interpolating predictor can have similar risk to predictors based on principal components regression and ridge regression, and can improve over LASSO-based predictors, in the high-dimensional regime.
- Score: 2.1055643409860743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work studies finite-sample properties of the risk of the minimum-norm
interpolating predictor in high-dimensional regression models. If the effective
rank of the covariance matrix $\Sigma$ of the $p$ regression features is much
larger than the sample size $n$, we show that the min-norm interpolating
predictor is not desirable, as its risk approaches the risk of trivially
predicting the response by 0. However, our detailed finite-sample analysis
reveals, surprisingly, that this behavior is not present when the regression
response and the features are {\it jointly} low-dimensional, following a widely
used factor regression model. Within this popular model class, and when the
effective rank of $\Sigma$ is smaller than $n$, while still allowing for $p \gg
n$, both the bias and the variance terms of the excess risk can be controlled,
and the risk of the minimum-norm interpolating predictor approaches optimal
benchmarks. Moreover, through a detailed analysis of the bias term, we exhibit
model classes under which our upper bound on the excess risk approaches zero,
while the corresponding upper bound in the recent work arXiv:1906.11300
diverges. Furthermore, we show that the minimum-norm interpolating predictor
analyzed under the factor regression model, despite being model-agnostic and
devoid of tuning parameters, can have similar risk to predictors based on
principal components regression and ridge regression, and can improve over
LASSO-based predictors, in the high-dimensional regime.
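
As a concrete illustration of the setting above, the following is a minimal simulation sketch (not the authors' code): it draws data from a factor regression model with $p \gg n$, computes the minimum-norm interpolating predictor via the pseudoinverse, and compares its out-of-sample risk with off-the-shelf ridge, PCR, and LASSO baselines. All sizes, noise levels, and penalty values are illustrative assumptions.

```python
# Illustrative sketch of min-norm interpolation under a factor regression model
# X = Z A^T + E, y = Z theta + eps, with p >> n (parameter values are assumptions).
import numpy as np
from numpy.linalg import pinv
from sklearn.linear_model import Ridge, Lasso, LinearRegression
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n, p, k = 100, 2000, 5              # sample size, feature dimension, latent dimension (assumed)
A = rng.normal(size=(p, k))          # factor loading matrix
theta = rng.normal(size=k)           # regression coefficients on the latent factors
sigma_e, sigma_y = 0.5, 0.5          # feature and response noise levels (assumed)

def sample(m):
    Z = rng.normal(size=(m, k))                          # latent factors
    X = Z @ A.T + sigma_e * rng.normal(size=(m, p))      # jointly low-dimensional features
    y = Z @ theta + sigma_y * rng.normal(size=m)         # response driven by the same factors
    return X, y

X, y = sample(n)
X_test, y_test = sample(5000)

def risk(y_hat):
    return np.mean((y_test - y_hat) ** 2)

# Minimum-norm interpolating predictor: beta = X^+ y (model-agnostic, no tuning parameters).
beta_mn = pinv(X) @ y
print("min-norm interpolation:", risk(X_test @ beta_mn))

# Ridge regression (penalty chosen ad hoc here; in practice cross-validated).
ridge = Ridge(alpha=1.0).fit(X, y)
print("ridge:", risk(ridge.predict(X_test)))

# Principal components regression with k components (assumes the latent dimension is known).
pca = PCA(n_components=k).fit(X)
pcr = LinearRegression().fit(pca.transform(X), y)
print("PCR:", risk(pcr.predict(pca.transform(X_test))))

# LASSO (penalty chosen ad hoc here).
lasso = Lasso(alpha=0.1).fit(X, y)
print("LASSO:", risk(lasso.predict(X_test)))

# Trivial zero predictor: the benchmark the interpolator degrades to when the
# effective rank of the feature covariance greatly exceeds n.
print("zero predictor:", risk(np.zeros(len(y_test))))
```

In this jointly low-dimensional regime, the abstract's message is that the interpolator's risk should track the PCR and ridge benchmarks rather than degrading toward the risk of the zero predictor.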
Related papers
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We characterize the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high-dimensional datasets.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Ridge interpolators in correlated factor regression models -- exact risk analysis [0.0]
We consider correlated factor regression models (FRM) and analyze the performance of classical ridge interpolators.
We provide excess prediction risk characterizations that clearly show the dependence on all key model parameters (a generic ridgeless-limit sketch appears after this list).
arXiv Detail & Related papers (2024-06-13T14:46:08Z) - Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Transfer Learning with Random Coefficient Ridge Regression [2.0813318162800707]
Ridge regression with random coefficients provides an important alternative to fixed-coefficient regression in the high-dimensional setting.
This paper considers estimation and prediction of random coefficient ridge regression in the setting of transfer learning.
arXiv Detail & Related papers (2023-06-28T04:36:37Z) - Prediction Risk and Estimation Risk of the Ridgeless Least Squares Estimator under General Assumptions on Regression Errors [10.857775300638831]
We explore prediction risk as well as estimation risk under more general regression error assumptions.
Our findings suggest that the benefits of overparameterization can extend to time series, panel, and grouped data.
arXiv Detail & Related papers (2023-05-22T10:04:20Z) - Mitigating multiple descents: A model-agnostic framework for risk
monotonization [84.6382406922369]
We develop a general framework for risk monotonization based on cross-validation.
We propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting.
arXiv Detail & Related papers (2022-05-25T17:41:40Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that provided by ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Dimensionality reduction, regularization, and generalization in
overparameterized regressions [8.615625517708324]
We show that PCA-OLS, also known as principal component regression, can be avoided with a dimensionality reduction.
We show that dimensionality reduction improves robustness while OLS is arbitrarily susceptible to adversarial attacks.
We find that methods in which the projection depends on the training data can outperform methods where the projections are chosen independently of the training data.
arXiv Detail & Related papers (2020-11-23T15:38:50Z) - Prediction in latent factor regression: Adaptive PCR and beyond [2.9439848714137447]
We prove a master theorem that establishes a risk bound for a large class of predictors.
We use our main theorem to recover known risk bounds for the minimum-norm interpolating predictor.
We conclude with a detailed simulation study to support and complement our theoretical results.
arXiv Detail & Related papers (2020-07-20T12:42:47Z)
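
Relating the ridge-interpolator entry above to the minimum-norm predictor of the main paper, the sketch below (a generic illustration under assumed dimensions, not the exact-risk analysis of either paper) checks the standard fact that, when $p > n$, the ridge solution converges to the minimum-norm interpolator as the penalty tends to zero.

```python
# Ridgeless limit: ridge solutions approach the minimum-l2-norm interpolator as lambda -> 0
# when p > n. Dimensions and data below are arbitrary assumptions for illustration.
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 500                          # p > n, so infinitely many interpolating solutions exist
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

beta_min_norm = np.linalg.pinv(X) @ y   # minimum-l2-norm interpolator, X^+ y

for lam in [1.0, 1e-2, 1e-4, 1e-8]:
    # Ridge solution in its kernel (n x n) form, stable when p >> n:
    # beta_lambda = X^T (X X^T + lam I_n)^{-1} y
    beta_ridge = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    print(f"lambda={lam:g}  ||beta_ridge - beta_min_norm|| = "
          f"{np.linalg.norm(beta_ridge - beta_min_norm):.3e}")
```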