High-dimensional regression with potential prior information on variable importance
- URL: http://arxiv.org/abs/2109.11281v1
- Date: Thu, 23 Sep 2021 10:34:37 GMT
- Title: High-dimensional regression with potential prior information on variable importance
- Authors: Benjamin G. Stokell, Rajen D. Shah
- Abstract summary: We propose a simple scheme involving fitting a sequence of models indicated by the ordering.
We show that the computational cost for fitting all models when ridge regression is used is no more than for a single fit of ridge regression.
We describe a strategy for Lasso regression that makes use of previous fits to greatly speed up fitting the entire sequence of models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There are a variety of settings where vague prior information may be
available on the importance of predictors in high-dimensional regression
settings. Examples include ordering on the variables offered by their empirical
variances (which is typically discarded through standardisation), the lag of
predictors when fitting autoregressive models in time series settings, or the
level of missingness of the variables. Whilst such orderings may not match the
true importance of variables, we argue that there is little to be lost, and
potentially much to be gained, by using them. We propose a simple scheme
involving fitting a sequence of models indicated by the ordering. We show that
the computational cost for fitting all models when ridge regression is used is
no more than for a single fit of ridge regression, and describe a strategy for
Lasso regression that makes use of previous fits to greatly speed up fitting
the entire sequence of models. We propose to select a final estimator by
cross-validation and provide a general result on the quality of the best
performing estimator on a test set selected from among a number $M$ of
competing estimators in a high-dimensional linear regression setting. Our
result requires no sparsity assumptions and shows that only a $\log M$ price is
incurred compared to the unknown best estimator. We demonstrate the
effectiveness of our approach when applied to missing or corrupted data, and
time series settings. An R package is available on GitHub.
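The scheme in the abstract lends itself to a short illustration. The following is a minimal Python sketch, not the authors' R implementation; the names, the train/validation split, and the single fixed penalty lam are assumptions for illustration. It fits ridge regression on each nested prefix of a supplied importance ordering and selects a prefix on held-out data. The naive loop refits each model from scratch; the paper's result that the whole ridge sequence costs no more than a single fit is not reproduced here.

    import numpy as np

    def ridge_fit(X, y, lam):
        # Closed-form ridge solution: beta = (X'X + lam * I)^{-1} X'y.
        p = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

    def fit_ordered_sequence(X_tr, y_tr, X_val, y_val, ordering, lam=1.0):
        # `ordering` ranks columns from most to least presumed-important,
        # e.g. by empirical variance or by lag in an autoregressive model.
        # Fit ridge on each nested prefix and keep the best validation error.
        best_err, best_cols, best_beta = np.inf, None, None
        for k in range(1, len(ordering) + 1):
            cols = list(ordering[:k])
            beta = ridge_fit(X_tr[:, cols], y_tr, lam)
            err = np.mean((X_val[:, cols] @ beta - y_val) ** 2)
            if err < best_err:
                best_err, best_cols, best_beta = err, cols, beta
        return best_cols, best_beta

    # Example ordering (a hypothetical choice): empirical variance, largest first.
    # ordering = np.argsort(-X_tr.var(axis=0))
    # cols, beta = fit_ordered_sequence(X_tr, y_tr, X_val, y_val, ordering)

Selecting among the $M = p$ nested fits on held-out data is an instance of the paper's general result on choosing from $M$ competing estimators, where only a $\log M$-type price is paid relative to the unknown best candidate.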
Related papers
- Adaptive Optimization for Prediction with Missing Data [6.800113478497425]
We show that some adaptive linear regression models are equivalent to learning an imputation rule and a downstream linear regression model simultaneously.
In settings where data is strongly not missing at random, our methods achieve a 2-10% improvement in out-of-sample accuracy.
arXiv Detail & Related papers (2024-02-02T16:35:51Z)
- The Adaptive $\tau$-Lasso: Robustness and Oracle Properties [12.06248959194646]
This paper introduces a new regularized version of the robust $\tau$-regression estimator for analyzing high-dimensional datasets.
The resulting estimator, termed the adaptive $\tau$-Lasso, is robust to outliers and high-leverage points.
In the face of outliers and high-leverage points, the adaptive $\tau$-Lasso and $\tau$-Lasso estimators achieve the best or close-to-best performance.
arXiv Detail & Related papers (2023-04-18T21:34:14Z)
- ecpc: An R-package for generic co-data models for high-dimensional prediction [0.0]
The R-package ecpc originally accommodated various and possibly multiple co-data sources.
We present an extension to the method and software for generic co-data models.
We show how ridge penalties may be transformed to elastic net penalties with the R-package squeezy.
arXiv Detail & Related papers (2022-05-16T12:55:19Z)
- $p$-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets [74.37849422071206]
We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses.
We show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data.
arXiv Detail & Related papers (2022-03-25T10:54:41Z)
- Bayesian Regression Approach for Building and Stacking Predictive Models in Time Series Analytics [0.0]
The paper describes the use of Bayesian regression for building time series models and stacking different predictive models for time series.
It makes it possible to estimate the uncertainty of time series predictions and to calculate value-at-risk characteristics.
arXiv Detail & Related papers (2022-01-06T12:58:23Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To harness the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
arXiv Detail & Related papers (2021-05-07T03:33:00Z)
- Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z)
- Ridge Regression Revisited: Debiasing, Thresholding and Bootstrap [4.142720557665472]
Ridge regression may be worth another look since, after debiasing and thresholding, it may offer some advantages over the Lasso.
In this paper, we define a debiased and thresholded ridge regression method, and prove a consistency result and a Gaussian approximation theorem.
In addition to estimation, we consider the problem of prediction, and present a novel, hybrid bootstrap algorithm tailored for prediction intervals.
arXiv Detail & Related papers (2020-09-17T05:04:10Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We study a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)