Dimension Reduction and MARS
- URL: http://arxiv.org/abs/2302.05790v2
- Date: Tue, 4 Jul 2023 20:58:19 GMT
- Title: Dimension Reduction and MARS
- Authors: Yu Liu, Degui Li, Yingcun Xia
- Abstract summary: The multivariate adaptive regression spline (MARS) is one of the popular estimation methods for nonparametric multivariate regressions.
In this paper, we improve the performance of MARS by using linear combinations of the covariates which achieve sufficient dimension reduction.
Numerical studies and empirical applications show its effectiveness and improvement over MARS and other commonly-used nonparametric methods in regression estimation and prediction.
- Score: 4.525349089861123
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The multivariate adaptive regression spline (MARS) is one of the popular
estimation methods for nonparametric multivariate regressions. However, as MARS
is based on marginal splines, to incorporate interactions of covariates,
products of the marginal splines must be used, which leads to an unmanageable
number of basis functions when the order of interaction is high and results in
low estimation efficiency. In this paper, we improve the performance of MARS by
using linear combinations of the covariates which achieve sufficient dimension
reduction. The special basis functions of MARS facilitate calculation of
gradients of the regression function, and estimation of the linear combinations
is obtained via eigen-analysis of the outer-product of the gradients. Under
some technical conditions, the asymptotic theory is established for the
proposed estimation method. Numerical studies including both simulation and
empirical applications show its effectiveness in dimension reduction and
improvement over MARS and other commonly-used nonparametric methods in
regression estimation and prediction.
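To make the pipeline concrete, below is a minimal numerical sketch of the outer-product-of-gradients (OPG) step the abstract describes: estimate gradients of the regression function, average their outer products, and take leading eigenvectors as the dimension-reduction directions. Gradients are estimated here by local linear smoothing as a generic stand-in for the paper's MARS-based gradient formulas; the toy model, bandwidth, and dimensions are illustrative assumptions, not the authors' implementation.
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y depends on x only through two linear combinations.
n, p = 400, 6
X = rng.normal(size=(n, p))
b1 = np.array([1., 1., 0., 0., 0., 0.]) / np.sqrt(2)
b2 = np.array([0., 0., 1., -1., 0., 0.]) / np.sqrt(2)
y = np.sin(X @ b1) + (X @ b2) ** 2 + 0.1 * rng.normal(size=n)

def local_linear_gradients(X, y, bandwidth=1.0):
    """Gradient of E[y|x] at each sample point via kernel-weighted
    local linear regression (a generic stand-in for the paper's
    MARS-based gradient calculations)."""
    n, p = X.shape
    grads = np.empty((n, p))
    for i in range(n):
        d = X - X[i]                                  # centred design
        w = np.exp(-0.5 * (d ** 2).sum(axis=1) / bandwidth ** 2)
        Z = np.hstack([np.ones((n, 1)), d])           # intercept + slopes
        A = Z.T @ (Z * w[:, None])
        b = Z.T @ (w * y)
        coef = np.linalg.solve(A, b)
        grads[i] = coef[1:]                           # slope = gradient
    return grads

# OPG: the leading eigenvectors of M = average of grad grad^T span
# the dimension-reduction subspace.
G = local_linear_gradients(X, y)
M = G.T @ G / n
eigval, eigvec = np.linalg.eigh(M)                    # ascending order
B_hat = eigvec[:, ::-1][:, :2]                        # top-2 directions
print("top eigenvalues:", eigval[::-1][:3].round(3))
print("estimated directions (columns):\n", B_hat.round(3))
```
A sharp drop after the second eigenvalue indicates the estimated structural dimension, mirroring the eigen-analysis step in the paper.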
Related papers
- Method-of-Moments Inference for GLMs and Doubly Robust Functionals under Proportional Asymptotics [30.324051162373973]
We consider the estimation of regression coefficients and signal-to-noise ratio in high-dimensional Generalized Linear Models (GLMs).
We derive Consistent and Asymptotically Normal (CAN) estimators of our targets of inference.
We complement our theoretical results with numerical experiments and comparisons with existing literature.
arXiv Detail & Related papers (2024-08-12T12:43:30Z)
- Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression [5.883916678819683]
We study the trajectory of iterations and the convergence rates of the Expectation-Maximization (EM) algorithm for two-component Mixed Linear Regression (2MLR).
Recent results have established the super-linear convergence of EM for 2MLR in the noiseless and high SNR settings.
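As an illustration of the model this entry analyzes, here is a minimal sketch of EM for the symmetric (plus/minus beta) two-component mixed linear regression; the data-generating values and initialization are illustrative choices, not the paper's setup.
```python
import numpy as np

rng = np.random.default_rng(1)

# Symmetric 2MLR: each response is generated from +beta or -beta
# with equal probability.
n, p, sigma = 500, 3, 0.3
X = rng.normal(size=(n, p))
beta_true = np.array([1.5, -1.0, 0.5])
signs = rng.choice([-1.0, 1.0], size=n)
y = signs * (X @ beta_true) + sigma * rng.normal(size=n)

beta = rng.normal(size=p)                  # random initialization
for _ in range(100):
    # E-step: posterior probability that each point uses +beta.
    lp = -0.5 * ((y - X @ beta) / sigma) ** 2
    ln = -0.5 * ((y + X @ beta) / sigma) ** 2
    w = 1.0 / (1.0 + np.exp(ln - lp))
    # M-step: least squares against sign-weighted responses,
    # with signed responsibilities s = 2w - 1 in [-1, 1].
    beta = np.linalg.lstsq(X, (2 * w - 1) * y, rcond=None)[0]

# The model is identified only up to sign.
err = min(np.linalg.norm(beta - beta_true), np.linalg.norm(beta + beta_true))
print("parameter error (up to sign):", round(err, 4))
```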
arXiv Detail & Related papers (2024-05-28T14:46:20Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Vector-Valued Least-Squares Regression under Output Regularity Assumptions [73.99064151691597]
We propose and analyse a reduced-rank method for solving least-squares regression problems with infinite-dimensional output.
We derive learning bounds for our method, and study under which settings statistical performance is improved in comparison to the full-rank method.
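For intuition, here is a minimal finite-dimensional sketch of the classical reduced-rank recipe (truncated SVD of the full-rank least-squares fit), used as a simple stand-in for the paper's infinite-dimensional estimator; sizes and ranks are illustrative.
```python
import numpy as np

rng = np.random.default_rng(2)

# Vector-valued regression Y = X B + noise with a low-rank B.
n, p, q, r = 200, 10, 8, 2
B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))   # rank r
X = rng.normal(size=(n, p))
Y = X @ B_true + 0.1 * rng.normal(size=(n, q))

# Full-rank least squares, then project the fitted values onto the
# best rank-r matrix (truncated SVD) and refit: the classical
# reduced-rank regression solution.
B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
Y_rr = U[:, :r] * s[:r] @ Vt[:r]          # rank-r fitted values
B_rr = np.linalg.lstsq(X, Y_rr, rcond=None)[0]

print("full-rank error:   ", round(np.linalg.norm(B_ols - B_true), 4))
print("reduced-rank error:", round(np.linalg.norm(B_rr - B_true), 4))
```
When the true coefficient matrix is low-rank, the reduced-rank fit typically has smaller estimation error than the full-rank one, which is the comparison the paper's bounds formalize.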
arXiv Detail & Related papers (2022-11-16T15:07:00Z)
- Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer sampling steps.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
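For context, here is a minimal sketch of plain AIS on a toy one-dimensional problem, with fixed geometric bridges rather than the learned intermediary distributions the paper optimizes; the target, step count, and proposal scale are illustrative choices.
```python
import numpy as np

rng = np.random.default_rng(3)

# AIS sketch: estimate Z = integral of prior(x) * lik(x) dx by
# annealing from the prior to prior * lik along geometric bridges.
mu, sig = 2.0, 0.5
log_lik = lambda x: -0.5 * ((x - mu) / sig) ** 2      # unnormalized
log_prior = lambda x: -0.5 * x ** 2                   # N(0,1), up to const

def ais(n_particles=4000, n_steps=100, mh_scale=0.5):
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    db = betas[1] - betas[0]
    x = rng.normal(size=n_particles)                  # exact prior draws
    log_w = np.zeros(n_particles)
    for b in betas[1:]:
        log_w += db * log_lik(x)                      # weight increment
        # One Metropolis step targeting prior(x) * lik(x)^b.
        prop = x + mh_scale * rng.normal(size=n_particles)
        log_acc = (log_prior(prop) + b * log_lik(prop)) \
                  - (log_prior(x) + b * log_lik(x))
        accept = np.log(rng.uniform(size=n_particles)) < log_acc
        x = np.where(accept, prop, x)
    # log-mean-exp of the weights estimates log Z.
    return np.logaddexp.reduce(log_w) - np.log(n_particles)

# Closed form for this Gaussian toy problem, for comparison.
log_Z_true = np.log(sig / np.sqrt(1 + sig ** 2)) - mu ** 2 / (2 * (1 + sig ** 2))
print("AIS estimate:", round(ais(), 4), " exact:", round(log_Z_true, 4))
```
The hyperparameters tuned in the paper correspond to the bridging schedule and transition kernels that are fixed by hand in this sketch.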
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are imposed through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- Robust Regularized Low-Rank Matrix Models for Regression and Classification [14.698622796774634]
We propose a framework for matrix variate regression models based on a rank constraint, vector regularization (e.g., sparsity), and a general loss function.
We show that the algorithm is guaranteed to converge, and all accumulation points of the algorithm have estimation errors of order $O(1/\sqrt{n})$ asymptotically, substantially attaining the minimax rate.
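A minimal sketch of one standard algorithm in this family, projected gradient descent with truncated-SVD projection onto the rank-constrained set, shown here with squared loss and no vector regularization as a generic stand-in for the paper's more general framework; sizes and step size are illustrative.
```python
import numpy as np

rng = np.random.default_rng(4)

# Matrix-variate regression y_i = <X_i, B> + noise with rank(B) <= r.
n, d1, d2, r = 300, 8, 8, 2
B_true = rng.normal(size=(d1, r)) @ rng.normal(size=(r, d2))
Xs = rng.normal(size=(n, d1, d2))
y = np.einsum('nij,ij->n', Xs, B_true) + 0.1 * rng.normal(size=n)

B = np.zeros((d1, d2))
step = 1.0 / n                  # Gaussian design: Hessian scale is ~n
for _ in range(300):
    resid = np.einsum('nij,ij->n', Xs, B) - y
    grad = np.einsum('n,nij->ij', resid, Xs)   # grad of 0.5 * sum resid^2
    B = B - step * grad
    # Projection onto the rank-r set via truncated SVD.
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    B = U[:, :r] * s[:r] @ Vt[:r]

print("relative error:", round(np.linalg.norm(B - B_true) / np.linalg.norm(B_true), 4))
```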
arXiv Detail & Related papers (2022-05-14T18:03:48Z)
- MARS via LASSO [1.5199437137239338]
We propose and study a natural variant of the MARS method.
Our method is based on least squares estimation over a convex class of functions.
Our estimator can be computed via finite-dimensional convex optimization.
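A minimal one-dimensional sketch of the idea: replace MARS's greedy forward search with l1-penalized least squares over a large dictionary of hinge functions, so that the convex program does the basis selection. The knot grid and penalty level are illustrative choices, not the paper's estimator.
```python
import numpy as np
from sklearn.linear_model import Lasso  # any l1 solver would do

rng = np.random.default_rng(5)

# Piecewise-linear truth, observed with noise.
n = 300
x = rng.uniform(-2, 2, size=n)
y = np.abs(x) + 0.5 * np.maximum(x - 1, 0) + 0.1 * rng.normal(size=n)

# Dictionary of paired hinges max(0, x - t) and max(0, t - x),
# the same basis functions MARS uses, at a dense knot grid.
knots = np.linspace(-2, 2, 41)
H = np.hstack([np.maximum(x[:, None] - knots, 0.0),
               np.maximum(knots - x[:, None], 0.0)])

fit = Lasso(alpha=0.01).fit(H, y)
print("selected hinges:", int(np.sum(fit.coef_ != 0)), "of", H.shape[1])
print("training R^2:", round(fit.score(H, y), 3))
```
The l1 penalty keeps only a few hinges active, recovering a sparse piecewise-linear fit without the greedy search.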
arXiv Detail & Related papers (2021-11-23T07:30:33Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that of ordinary least squares.
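A minimal sketch contrasting the two estimators the entry compares: constant-stepsize SGD with tail (iterate) averaging versus ordinary least squares on simulated linear regression. The dimensions, stepsize, and averaging window are illustrative choices.
```python
import numpy as np

rng = np.random.default_rng(6)

n, p = 2000, 20
X = rng.normal(size=(n, p))
w_true = rng.normal(size=p)
y = X @ w_true + 0.5 * rng.normal(size=n)

eta = 0.01                                    # constant stepsize
w = np.zeros(p)
w_bar, count = np.zeros(p), 0
for i in range(n):                            # single pass over the data
    w -= eta * (X[i] @ w - y[i]) * X[i]       # one-sample gradient step
    if i >= n // 2:                           # average the tail iterates
        w_bar += w
        count += 1
w_sgd = w_bar / count

w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("SGD error:", round(np.linalg.norm(w_sgd - w_true), 4))
print("OLS error:", round(np.linalg.norm(w_ols - w_true), 4))
```
The averaging step is what tames the noise injected by the constant stepsize; without it, the last iterate hovers in a noise ball around the solution.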
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- Support estimation in high-dimensional heteroscedastic mean regression [2.28438857884398]
We consider a linear mean regression model with random design and potentially heteroscedastic, heavy-tailed errors.
We use a strictly convex, smooth variant of the Huber loss function with tuning parameter depending on the parameters of the problem.
For the resulting estimator we show sign-consistency and optimal rates of convergence in the $\ell_\infty$ norm.
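A minimal sketch in the spirit of this entry, using the smooth, strictly convex pseudo-Huber loss (one smooth Huber variant; the paper's exact loss and tuning rule may differ) with a soft-thresholding proximal step for sparsity. The tuning parameter, penalty level, and step size are illustrative.
```python
import numpy as np

rng = np.random.default_rng(7)

# Sparse mean regression with heavy-tailed errors.
n, p, s = 200, 50, 3
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:s] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.standard_t(df=2, size=n)      # heavy tails

delta, lam, step = 1.0, 0.1, 0.5 / n
beta = np.zeros(p)
for _ in range(500):
    r = y - X @ beta
    # Pseudo-Huber influence function: r / sqrt(1 + (r/delta)^2),
    # bounded, so large residuals have limited leverage.
    psi = r / np.sqrt(1.0 + (r / delta) ** 2)
    beta += step * (X.T @ psi)                        # gradient step
    # Proximal step for the l1 penalty lam * n * ||beta||_1.
    thresh = step * lam * n
    beta = np.sign(beta) * np.maximum(np.abs(beta) - thresh, 0.0)

print("estimated support:", np.nonzero(beta)[0])
```
The bounded influence function is what delivers robustness to the heavy-tailed errors, while the l1 proximal step targets the support recovery (sign-consistency) studied in the paper.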
arXiv Detail & Related papers (2020-11-03T09:46:31Z)