On LASSO for High Dimensional Predictive Regression
- URL: http://arxiv.org/abs/2212.07052v2
- Date: Tue, 16 Jan 2024 15:19:40 GMT
- Title: On LASSO for High Dimensional Predictive Regression
- Authors: Ziwei Mei and Zhentao Shi
- Abstract summary: This paper examines LASSO, a widely-used $L_1$-penalized regression method, in high dimensional linear predictive regressions.
The consistency of LASSO is contingent upon two key components: the deviation bound of the cross product of the regressors and the error term, and the restricted eigenvalue of the Gram matrix.
Using machine learning and macroeconomic domain expertise, LASSO demonstrates strong performance in forecasting the unemployment rate.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper examines LASSO, a widely-used $L_{1}$-penalized regression method,
in high dimensional linear predictive regressions, particularly when the number
of potential predictors exceeds the sample size and numerous unit root
regressors are present. The consistency of LASSO is contingent upon two key
components: the deviation bound of the cross product of the regressors and the
error term, and the restricted eigenvalue of the Gram matrix. We present new
probabilistic bounds for these components, suggesting that LASSO's rates of
convergence are different from those typically observed in cross-sectional
cases. When applied to a mixture of stationary, nonstationary, and cointegrated
predictors, LASSO maintains its asymptotic guarantee if predictors are
scale-standardized. Leveraging machine learning and macroeconomic domain
expertise, LASSO demonstrates strong performance in forecasting the
unemployment rate, as evidenced by its application to the FRED-MD database.
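To fix ideas, here is the textbook cross-sectional benchmark that the paper's bounds generalize (a generic sketch of standard LASSO theory, not the paper's exact assumptions or rates). For the model $y_t = x_t^{\top}\beta^* + u_t$ with an $s$-sparse $\beta^*$, the LASSO solves

$$\hat{\beta} = \arg\min_{\beta}\; \frac{1}{n}\sum_{t=1}^{n}\big(y_t - x_t^{\top}\beta\big)^2 + \lambda \lVert \beta \rVert_1,$$

and its consistency rests on (i) a deviation bound, requiring the penalty to dominate the regressor-error cross product, $\lambda \ge c\,\lVert X^{\top}u/n \rVert_{\infty}$ for a suitable constant $c > 2$ with high probability, and (ii) a restricted eigenvalue of the Gram matrix $\widehat{\Sigma} = X^{\top}X/n$,

$$\kappa = \min_{\delta \in \mathcal{C}(S)} \frac{\delta^{\top}\widehat{\Sigma}\,\delta}{\lVert \delta_S \rVert_2^2} > 0, \qquad \mathcal{C}(S) = \big\{\delta : \lVert \delta_{S^c} \rVert_1 \le 3 \lVert \delta_S \rVert_1 \big\},$$

which together deliver the familiar bound $\lVert \hat{\beta} - \beta^* \rVert_1 \lesssim s\lambda/\kappa$. With unit-root regressors, both ingredients scale differently in $n$, which is why the paper obtains convergence rates unlike the cross-sectional ones.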
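The scale-standardization point can be illustrated with a small simulation. Below is a minimal sketch under an entirely hypothetical design, with scikit-learn's LassoCV standing in for the paper's tuning rule:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p_stat, p_ur = 200, 150, 150          # more predictors than observations

# Stationary predictors: iid N(0, 1) columns.
X_stat = rng.standard_normal((n, p_stat))
# Unit-root predictors: cumulative sums of iid shocks (I(1) processes).
X_ur = np.cumsum(rng.standard_normal((n, p_ur)), axis=0)
X = np.hstack([X_stat, X_ur])

# Sparse truth: a few stationary and a few unit-root predictors matter.
beta = np.zeros(p_stat + p_ur)
beta[:3] = 1.0
beta[p_stat:p_stat + 3] = 0.1
y = X @ beta + rng.standard_normal(n)

# Scale-standardize each predictor by its sample standard deviation;
# without this step the large-variance unit-root columns absorb the
# penalty unevenly.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

fit = LassoCV(cv=5).fit(X_std, y)
print("selected predictors:", np.flatnonzero(fit.coef_))
```

Dropping the standardization step lets the I(1) columns, whose sample variances grow with $n$, dominate the design, which is the failure mode the paper's prescription avoids.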
Related papers
- On LASSO Inference for High Dimensional Predictive Regression [4.658398919599387]
We propose a novel estimator called IVX-desparsified LASSO (XDlasso), which eliminates the shrinkage bias of LASSO and the bias induced by persistent regressors simultaneously.
We investigate the predictability of U.S. stock returns based on the earnings-price ratio, and of U.S. inflation using the unemployment rate.
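The desparsification half of the construction can be sketched generically. The correction below is the standard debiased-LASSO step, which removes shrinkage bias; XDlasso combines such a correction with IVX-type instrumentation to cope with persistent regressors, and the exact construction is in the paper:

$$\hat{\beta}^{\mathrm{debiased}} = \hat{\beta}^{\mathrm{lasso}} + \frac{1}{n}\,\widehat{\Theta}\,X^{\top}\big(y - X\hat{\beta}^{\mathrm{lasso}}\big),$$

where $\widehat{\Theta}$ is an approximate inverse of the Gram matrix $X^{\top}X/n$; the added term undoes the penalty-induced shrinkage and restores an asymptotically normal, inference-ready estimator.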
arXiv Detail & Related papers (2024-09-16T06:41:58Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We analyze the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory on a variety of high-dimensional datasets.
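For reference, the objects under study are the standard ridge estimator and its out-of-sample risk (generic definitions, not the paper's notation):

$$\hat{\beta}_{\lambda} = \big(X^{\top}X + \lambda I\big)^{-1}X^{\top}y, \qquad R_{\mathrm{out}}(\lambda) = \mathbb{E}\big[(y_* - x_*^{\top}\hat{\beta}_{\lambda})^2\big],$$

with the expectation taken over a test pair $(x_*, y_*)$ that may itself be correlated with the training sample, the regime the paper characterizes.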
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting in practice.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that of ordinary least squares.
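A minimal simulation sketch of that basic setting (hypothetical data and stepsize; the paper's tail-averaging and benign-overfitting analysis is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 500                              # overparameterized: p > n
X = rng.standard_normal((n, p)) / np.sqrt(p) # rows have roughly unit norm
y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)

# Constant-stepsize SGD on the squared loss, one sample per step.
w = np.zeros(p)
avg = np.zeros(p)
eta, T = 0.5, 20_000
for t in range(T):
    i = rng.integers(n)
    w -= eta * (X[i] @ w - y[i]) * X[i]
    avg += (w - avg) / (t + 1)               # running average of the iterates

# Unregularized least squares returns the minimum-norm interpolator when p > n.
w_ols = np.linalg.pinv(X) @ y
print("distance between SGD average and min-norm OLS:", np.linalg.norm(avg - w_ols))
```

The comparison with the pseudoinverse solution mirrors the paper's contrast between SGD's implicit regularization and ordinary least squares.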
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- On the Generalization of Stochastic Gradient Descent with Momentum [58.900860437254885]
We first show that there exists a convex loss function for which algorithmic stability fails to establish generalization guarantees.
For smooth Lipschitz loss functions, we analyze a modified momentum-based update rule, and show that it admits an upper-bound on the generalization error.
For the special case of strongly convex loss functions, we find a range of momentum values for which multiple epochs of standard SGDM, as a special form of SGDEM, also generalize.
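For context, the unmodified momentum (heavy-ball) update that these results concern reads

$$v_{t+1} = \mu\, v_t + \nabla f_{i_t}(w_t), \qquad w_{t+1} = w_t - \alpha\, v_{t+1},$$

with momentum $\mu \in [0,1)$, stepsize $\alpha$, and $f_{i_t}$ the loss of the example sampled at step $t$; the paper's SGDEM scheme modifies this rule, and its exact form is given there.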
arXiv Detail & Related papers (2021-02-26T18:58:29Z)
- The Predictive Normalized Maximum Likelihood for Over-parameterized Linear Regression with Norm Constraint: Regret and Double Descent [12.929639356256928]
We show that modern machine learning models do not obey the classical trade-off between the complexity of a prediction rule and its ability to generalize.
We use the recently proposed predictive normalized maximum likelihood (pNML), which is the min-max regret solution for individual data.
We demonstrate the use of the pNML regret as a point-wise learnability measure on synthetic data and show that it can successfully predict the double-descent phenomenon.
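As usually defined, the pNML distribution and its regret for a test feature $x$ given training data $D$ are

$$q_{\mathrm{pNML}}(y \mid x) = \frac{p_{\hat{\theta}(D \cup \{(x,y)\})}(y \mid x)}{\int p_{\hat{\theta}(D \cup \{(x,y')\})}(y' \mid x)\,dy'}, \qquad \Gamma(x) = \log \int p_{\hat{\theta}(D \cup \{(x,y')\})}(y' \mid x)\,dy',$$

where $\hat{\theta}(\cdot)$ is the (here, norm-constrained) maximum-likelihood estimator refit with the candidate test label included; small regret flags test points that are easy to learn (a generic statement of the min-max construction, not the paper's notation).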
arXiv Detail & Related papers (2021-02-14T15:49:04Z)
- CASTLE: Regularization via Auxiliary Causal Graph Discovery [89.74800176981842]
We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables.
CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features.
arXiv Detail & Related papers (2020-09-28T09:49:38Z)
- Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
- Machine Learning Time Series Regressions with an Application to Nowcasting [0.0]
This paper introduces structured machine learning regressions for high-dimensional time series data potentially sampled at different frequencies.
The sparse-group LASSO estimator can take advantage of such time series data structures and outperforms the unstructured LASSO.
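In its standard form the sparse-group LASSO penalty blends the LASSO and group-LASSO terms (a generic statement; in the mixed-frequency setting the groups would collect, e.g., the lag polynomial of one high-frequency series):

$$\hat{\beta} = \arg\min_{\beta}\; \frac{1}{2n}\lVert y - X\beta \rVert_2^2 + \lambda\Big(\alpha \lVert \beta \rVert_1 + (1-\alpha)\sum_{g \in \mathcal{G}} \lVert \beta_g \rVert_2\Big),$$

where $\mathcal{G}$ partitions the coefficients into groups and $\alpha \in [0,1]$ trades off within-group sparsity against group-level sparsity; $\alpha = 1$ recovers the plain LASSO.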
arXiv Detail & Related papers (2020-05-28T14:42:58Z)
- Interpolating Predictors in High-Dimensional Factor Regression [2.1055643409860743]
This work studies finite-sample properties of the risk of the minimum-norm interpolating predictor in high-dimensional regression models.
We show that the min-norm interpolating predictor can have similar risk to predictors based on principal components regression and ridge regression, and can improve over LASSO based predictors, in the high-dimensional regime.
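The predictor in question is the minimum $\ell_2$-norm interpolator, defined whenever $p > n$ and $XX^{\top}$ is invertible:

$$\hat{\beta}_{\min} = \arg\min\big\{\lVert \beta \rVert_2 : X\beta = y\big\} = X^{\top}\big(XX^{\top}\big)^{-1}y = X^{+}y,$$

with $X^{+}$ the Moore-Penrose pseudoinverse.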
arXiv Detail & Related papers (2020-02-06T22:08:36Z)
- The Reciprocal Bayesian LASSO [0.0]
We consider a fully Bayesian formulation of the rLASSO problem, which is based on the observation that the rLASSO estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate.
On simulated and real datasets, we show that the Bayesian formulation outperforms its classical cousin in estimation, prediction, and variable selection.
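The posterior-mode observation rests on the usual penalty-prior correspondence; writing the reciprocal penalty in its commonly cited form (treat the exact form as an assumption here),

$$\hat{\beta}_{\mathrm{rLASSO}} = \arg\min_{\beta}\; \frac{1}{2}\lVert y - X\beta \rVert_2^2 + \sum_{j}\frac{\lambda}{|\beta_j|}\,\mathbb{1}\{\beta_j \neq 0\} = \arg\max_{\beta}\; p(y \mid \beta)\,\pi(\beta),$$

under Gaussian errors and the prior $\pi(\beta) \propto \exp\!\big(-\sum_j \lambda\,|\beta_j|^{-1}\mathbb{1}\{\beta_j \neq 0\}\big)$, so the frequentist estimate is exactly a posterior mode.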
arXiv Detail & Related papers (2020-01-23T01:21:59Z)