Weighted Lasso Estimates for Sparse Logistic Regression: Non-asymptotic
Properties with Measurement Error
- URL: http://arxiv.org/abs/2006.06136v1
- Date: Thu, 11 Jun 2020 00:58:14 GMT
- Title: Weighted Lasso Estimates for Sparse Logistic Regression: Non-asymptotic
Properties with Measurement Error
- Authors: Huamei Huang, Yujing Gao, Huiming Zhang, and Bo Li
- Abstract summary: Two types of weighted Lasso estimates are proposed for $ell_1$-penalized logistic regression.
We show that the finite sample behavior of our proposed methods with a diverging number of predictors is illustrated by non-asymptotic oracle inequalities.
We compare the performance of our methods with former weighted estimates on simulated data, then apply these methods to do real data analysis.
- Score: 5.5233023574863624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When we are interested in high-dimensional system and focus on classification
performance, the $\ell_{1}$-penalized logistic regression is becoming important
and popular. However, the Lasso estimates could be problematic when penalties
of different coefficients are all the same and not related to the data. We
proposed two types of weighted Lasso estimates depending on covariates by the
McDiarmid inequality. Given sample size $n$ and dimension of covariates $p$,
the finite sample behavior of our proposed methods with a diverging number of
predictors is illustrated by non-asymptotic oracle inequalities such as
$\ell_{1}$-estimation error and squared prediction error of the unknown
parameters. We compare the performance of our methods with former weighted
estimates on simulated data, then apply these methods to do real data analysis.
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - Statistical Inference in Classification of High-dimensional Gaussian Mixture [1.2354076490479515]
We investigate the behavior of a general class of regularized convex classifiers in the high-dimensional limit.
Our focus is on the generalization error and variable selection properties of the estimators.
arXiv Detail & Related papers (2024-10-25T19:58:36Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $sqrt n $-rate.
We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size dependent parameters smoothing.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression [29.57766164934947]
We investigate popular resampling methods for estimating the uncertainty of statistical models.
We provide a tight description of the biases and variances estimated by these methods in the context of generalized linear models.
arXiv Detail & Related papers (2024-02-21T08:50:33Z) - On semi-supervised estimation using exponential tilt mixture models [12.347498345854715]
Consider a semi-supervised setting with a labeled dataset of binary responses and predictors and an unlabeled dataset with only predictors.
For semi-supervised estimation, we develop further analysis and understanding of a statistical approach using exponential tilt mixture (ETM) models.
arXiv Detail & Related papers (2023-11-14T19:53:26Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one allows to account not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Retire: Robust Expectile Regression in High Dimensions [3.9391041278203978]
Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data.
We propose and study (penalized) robust expectile regression (retire)
We show that the proposed procedure can be efficiently solved by a semismooth Newton coordinate descent algorithm.
arXiv Detail & Related papers (2022-12-11T18:03:12Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Doubly Robust Semiparametric Difference-in-Differences Estimators with
High-Dimensional Data [15.27393561231633]
We propose a doubly robust two-stage semiparametric difference-in-difference estimator for estimating heterogeneous treatment effects.
The first stage allows a general set of machine learning methods to be used to estimate the propensity score.
In the second stage, we derive the rates of convergence for both the parametric parameter and the unknown function.
arXiv Detail & Related papers (2020-09-07T15:14:29Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.