Efficient Truncated Linear Regression with Unknown Noise Variance
- URL: http://arxiv.org/abs/2208.12042v1
- Date: Thu, 25 Aug 2022 12:17:37 GMT
- Title: Efficient Truncated Linear Regression with Unknown Noise Variance
- Authors: Constantinos Daskalakis, Patroklos Stefanou, Rui Yao, Manolis
- Abstract summary: We provide the first computationally and statistically efficient estimators for truncated linear regression when the noise variance is unknown.
Our estimator is based on an efficient implementation of Projected Stochastic Gradient Descent on the negative log-likelihood of the truncated sample.
- Score: 26.870279729431328
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Truncated linear regression is a classical challenge in Statistics, wherein a
label, $y = w^T x + \varepsilon$, and its corresponding feature vector, $x \in
\mathbb{R}^k$, are only observed if the label falls in some subset $S \subseteq
\mathbb{R}$; otherwise the existence of the pair $(x, y)$ is hidden from
observation. Linear regression with truncated observations has remained a
challenge, in its general form, since the early works
of Tobin (1958) and Amemiya (1973). When the distribution of
the error is normal with known variance, recent work
of Daskalakis et al. (2019) provides computationally and
statistically efficient estimators of the linear model, $w$.
In this paper, we provide the first computationally and statistically
efficient estimators for truncated linear regression when the noise variance is
unknown, estimating both the linear model and the variance of the noise. Our
estimator is based on an efficient implementation of Projected Stochastic
Gradient Descent on the negative log-likelihood of the truncated sample.
Importantly, we show that the error of our estimates is asymptotically normal,
and we use this to provide explicit confidence regions for our estimates.
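To make the setup in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a one-dimensional feature, a one-sided survival set $S = [lo, \infty)$, and normal noise. The function names (`truncated_nll`, `project`, `psgd`, `simulate`), the box used for the projection, the step-size schedule, and the finite-difference gradients are all illustrative assumptions; the paper's own parameterization, projection set, and gradient estimator may differ.

```python
import math
import random


def truncated_nll(w, sigma, x, y, lo):
    """Negative log-likelihood of one pair (x, y) observed only when y >= lo."""
    mu = w * x
    z = (y - mu) / sigma
    # Probability that an untruncated label falls in S = [lo, inf).
    # erfc keeps the upper tail positive where 1 - Phi would round to 0.
    mass = 0.5 * math.erfc((lo - mu) / (sigma * math.sqrt(2.0)))
    gaussian_part = 0.5 * z * z + math.log(sigma) + 0.5 * math.log(2.0 * math.pi)
    # Dividing the density by `mass` adds +log(mass) to the negative log-likelihood.
    return gaussian_part + math.log(mass)


def project(w, sigma, w_max=5.0, sigma_min=0.2, sigma_max=5.0):
    """Clip (w, sigma) to a box; the box itself is an assumption of this sketch."""
    return max(-w_max, min(w_max, w)), max(sigma_min, min(sigma_max, sigma))


def psgd(samples, lo, steps=2000, lr=0.05, eps=1e-5, seed=0):
    """Projected SGD on the truncated NLL using central finite differences."""
    rng = random.Random(seed)
    w, sigma = 0.0, 1.0
    for t in range(1, steps + 1):
        x, y = rng.choice(samples)  # one random sample per step
        grad_w = (truncated_nll(w + eps, sigma, x, y, lo)
                  - truncated_nll(w - eps, sigma, x, y, lo)) / (2.0 * eps)
        grad_sigma = (truncated_nll(w, sigma + eps, x, y, lo)
                      - truncated_nll(w, sigma - eps, x, y, lo)) / (2.0 * eps)
        step = lr / math.sqrt(t)  # decaying step size
        w, sigma = project(w - step * grad_w, sigma - step * grad_sigma)
    return w, sigma


def simulate(n, w_true, sigma_true, lo, seed=0):
    """Draw pairs from y = w_true * x + noise and keep only those with y >= lo."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        x = rng.uniform(-1.0, 1.0)
        y = w_true * x + rng.gauss(0.0, sigma_true)
        if y >= lo:
            kept.append((x, y))
    return kept


if __name__ == "__main__":
    data = simulate(500, w_true=1.0, sigma_true=1.0, lo=0.0)
    print(psgd(data, lo=0.0))
```

With `lo = float("-inf")` the correction term vanishes and `truncated_nll` reduces to the ordinary Gaussian negative log-likelihood, which is a convenient sanity check. The sketch updates both `w` and `sigma`, mirroring the unknown-variance setting, but it carries none of the paper's guarantees.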
Related papers
- Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization [19.261178173399784]
Learning models have been shown to rely on spurious correlations between non-predictive features and the associated labels in the training data.
We quantify the amount of spurious correlations $C$ learned via linear regression, in terms of the data covariance and the strength $\lambda$ of the ridge regularization.
arXiv Detail & Related papers (2025-02-03T13:38:42Z) - Retire: Robust Expectile Regression in High Dimensions [3.9391041278203978]
Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data.
We propose and study (penalized) robust expectile regression (retire).
We show that the proposed procedure can be efficiently solved by a semismooth Newton coordinate descent algorithm.
arXiv Detail & Related papers (2022-12-11T18:03:12Z) - $p$-Generalized Probit Regression and Scalable Maximum Likelihood
Estimation via Sketching and Coresets [74.37849422071206]
We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses.
We show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data.
arXiv Detail & Related papers (2022-03-25T10:54:41Z) - Optimal Online Generalized Linear Regression with Stochastic Noise and
Its Application to Heteroscedastic Bandits [88.6139446295537]
We study the problem of online generalized linear regression in the setting of a generalized linear model with possibly unbounded additive noise.
We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise.
We propose an algorithm based on FTRL to achieve the first variance-aware regret bound.
arXiv Detail & Related papers (2022-02-28T08:25:26Z) - Non-Asymptotic Guarantees for Robust Statistical Learning under
$(1+\varepsilon)$-th Moment Assumption [0.716879432974126]
This paper proposes a log-truncated M-estimator for a large family of statistical regressions.
We show the superiority of log-truncated estimations over standard estimations.
arXiv Detail & Related papers (2022-01-10T06:22:30Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Performance of Bayesian linear regression in a model with mismatch [8.60118148262922]
We analyze the performance of an estimator given by the mean of a log-concave Bayesian posterior distribution with Gaussian prior.
This inference model can be rephrased as a version of the Gardner model in spin glasses.
arXiv Detail & Related papers (2021-07-14T18:50:13Z) - Understanding the Under-Coverage Bias in Uncertainty Estimation [58.03725169462616]
Quantile regression tends to under-cover relative to the desired coverage level in reality.
We prove that quantile regression suffers from an inherent under-coverage bias.
Our theory reveals that this under-coverage bias stems from a certain high-dimensional parameter estimation error.
arXiv Detail & Related papers (2021-06-10T06:11:55Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^{\beta}(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Computationally and Statistically Efficient Truncated Regression [36.3677715543994]
We provide a computationally and statistically efficient estimator for the classical problem of truncated linear regression.
Our estimator uses Projected Stochastic Gradient Descent (PSGD) without replacement on the negative log-likelihood of the truncated sample.
As a corollary, we show that SGD learns the parameters of single-layer neural networks with noisy activation functions.
arXiv Detail & Related papers (2020-10-22T19:31:30Z)