Sharp Asymptotics and Optimal Performance for Inference in Binary Models
- URL: http://arxiv.org/abs/2002.07284v2
- Date: Wed, 26 Feb 2020 06:14:29 GMT
- Title: Sharp Asymptotics and Optimal Performance for Inference in Binary Models
- Authors: Hossein Taheri, Ramtin Pedarsani, and Christos Thrampoulidis
- Abstract summary: We study convex empirical risk minimization for high-dimensional inference in binary models.
For binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one.
- Score: 41.7567932118769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study convex empirical risk minimization for high-dimensional inference in
binary models. Our first result sharply predicts the statistical performance of
such estimators in the linear asymptotic regime under isotropic Gaussian
features. Importantly, the predictions hold for a wide class of convex loss
functions, which we exploit in order to prove a bound on the best achievable
performance among them. Notably, we show that the proposed bound is tight for
popular binary models (such as Signed, Logistic or Probit), by constructing
appropriate loss functions that achieve it. More interestingly, for binary
linear classification under the Logistic and Probit models, we prove that the
performance of least-squares is no worse than 0.997 and 0.98 times the optimal
one. Numerical simulations corroborate our theoretical findings and suggest
they are accurate even for relatively small problem dimensions.
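The least-squares-versus-optimal comparison above can be probed with a quick simulation: isotropic Gaussian features, labels drawn from a Logistic model, and the plain least-squares estimator, with directional recovery measured by cosine similarity. This is an illustrative sketch under assumed dimensions (n = 2000, d = 200), not the paper's exact asymptotic analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 200                 # linear regime: n/d held fixed
w = rng.standard_normal(d)
w /= np.linalg.norm(w)           # unit-norm true direction

X = rng.standard_normal((n, d))              # isotropic Gaussian features
p = 1.0 / (1.0 + np.exp(-(X @ w)))           # Logistic label model
y = np.where(rng.random(n) < p, 1.0, -1.0)   # binary labels in {-1, +1}

# Least-squares estimator: minimize ||y - X w||^2
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Directional recovery: cosine similarity with the true direction
cos = abs(w_ls @ w) / np.linalg.norm(w_ls)
print(round(cos, 3))
```

At these dimensions the least-squares direction is already well aligned with the truth, consistent with the abstract's claim that its performance is close to optimal under the Logistic model.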
Related papers
- Optimal convex $M$-estimation via score matching [6.115859302936817]
We construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal variance in the downstream estimation of the regression coefficients.
Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution.
arXiv Detail & Related papers (2024-03-25T12:23:19Z) - Fundamental limits of Non-Linear Low-Rank Matrix Estimation [18.455890316339595]
Bayes-optimal performances are characterized by an equivalent Gaussian model with an effective prior.
We show that to reconstruct the signal accurately, one requires a signal-to-noise ratio growing as $N^{\frac{1}{2}(1-1/k_F)}$, where $k_F$ is the first non-zero Fisher information coefficient of the function.
arXiv Detail & Related papers (2024-03-07T05:26:52Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
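The NCE objective mentioned above can be sketched concretely: classify real samples against artificial noise with a logistic loss on the log-density ratio, letting a free parameter absorb the unknown normalizer. The 1-D Gaussian model and the crude grid search below are illustrative assumptions, not the paper's compositional method.

```python
import numpy as np

rng = np.random.default_rng(1)
x_data = rng.normal(0.0, 1.0, 2000)    # real samples, true scale 1.0
x_noise = rng.normal(0.0, 2.0, 2000)   # noise samples with known density

def log_unnorm(x, theta, c):
    # unnormalized model log-density; c absorbs the unknown log-normalizer
    return -x**2 / (2.0 * theta**2) - c

def log_noise(x):
    # log-density of N(0, 2^2) noise
    return -x**2 / 8.0 - np.log(2.0 * np.sqrt(2.0 * np.pi))

def nce_loss(params):
    theta, c = params
    # logistic loss on the log-ratio: data labeled 1, noise labeled 0
    g_data = log_unnorm(x_data, theta, c) - log_noise(x_data)
    g_noise = log_unnorm(x_noise, theta, c) - log_noise(x_noise)
    return (np.mean(np.logaddexp(0.0, -g_data))
            + np.mean(np.logaddexp(0.0, g_noise)))

# crude grid search over (theta, c); a real implementation would use a solver
thetas = np.linspace(0.5, 2.0, 31)
cs = np.linspace(0.0, 2.0, 41)
best = min(((nce_loss((t, c)), t, c) for t in thetas for c in cs))
print(best[1])  # theta estimate; should land near the true scale 1.0
```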
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Off-policy estimation of linear functionals: Non-asymptotic theory for
semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z) - Exploration-exploitation trade-off for continuous-time episodic
reinforcement learning with linear-convex models [2.503869683354711]
We study finite-time horizon control problems with linear dynamics but unknown coefficients and convex, but possibly irregular, objective function.
We identify conditions under which this performance gap is quadratic, improving the linear performance gap in recent work.
Next, we propose a phase-based learning algorithm for which we show how to optimise the exploration-exploitation trade-off and achieve sublinear regret.
arXiv Detail & Related papers (2021-12-19T21:47:04Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
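The anomaly this entry refers to is easy to reproduce: when which data point you collect next depends on past outcomes, even a sample mean (the least-squares estimate in this setting) becomes biased. The greedy two-armed bandit below is an illustrative sketch of the problem, not the paper's debiasing estimator; both arms have true mean 0, yet the under-sampled arm's estimate is systematically negative.

```python
import numpy as np

rng = np.random.default_rng(3)

def greedy_bandit_estimate(horizon=50):
    counts = np.zeros(2)
    sums = np.zeros(2)
    for t in range(horizon):
        if t < 2:
            arm = t                               # pull each arm once
        else:
            arm = int(np.argmax(sums / counts))   # greedy arm choice
        sums[arm] += rng.standard_normal()        # both true means are 0
        counts[arm] += 1
    means = sums / counts
    # the abandoned (under-sampled) arm was abandoned because it looked bad,
    # so its sample mean is biased downward
    return means[int(np.argmin(counts))]

est = np.mean([greedy_bandit_estimate() for _ in range(2000)])
print(round(est, 2))   # noticeably below 0: adaptivity biases least squares
```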
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - Near-Optimal High Probability Complexity Bounds for Non-Smooth
Stochastic Optimization with Heavy-Tailed Noise [63.304196997102494]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds whose dependence on the confidence level is either negative-power or logarithmic.
We propose novel stepsize rules for two gradient methods with clipping.
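Gradient clipping, the mechanism this entry builds on, can be sketched in a few lines: rescale any stochastic gradient whose norm exceeds a threshold before taking the step, which keeps heavy-tailed noise from derailing the iterates. The quadratic objective, Student-t noise, and constant stepsize below are illustrative assumptions, not the paper's stepsize rules.

```python
import numpy as np

rng = np.random.default_rng(2)

def clipped_sgd(x0, grad, steps=2000, eta=0.05, clip=1.0):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        norm = np.linalg.norm(g)
        if norm > clip:
            g = g * (clip / norm)   # rescale so the step norm is bounded
        x -= eta * g
    return x

target = np.array([1.0, -2.0])

def noisy_grad(x):
    # Student-t noise with df=2 has infinite variance (heavy-tailed)
    return (x - target) + rng.standard_t(2, size=2)

x_hat = clipped_sgd(np.zeros(2), noisy_grad)
print(np.linalg.norm(x_hat - target))   # small despite infinite-variance noise
```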
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Convex and Nonconvex Optimization Are Both Minimax-Optimal for Noisy
Blind Deconvolution under Random Designs [12.089409241521185]
We investigate the effectiveness of convex relaxation and nonconvex optimization in solving blind deconvolution vis-a-vis random noise.
Results significantly improve upon the state-of-the-art theoretical guarantees.
arXiv Detail & Related papers (2020-08-04T17:57:02Z) - Non-parametric Models for Non-negative Functions [48.7576911714538]
We provide the first model for non-negative functions that benefits from the same good properties as linear models.
We prove that it admits a representer theorem and provide an efficient dual formulation for convex problems.
arXiv Detail & Related papers (2020-07-08T07:17:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.