Multinomial Logistic Regression: Asymptotic Normality on Null Covariates
in High-Dimensions
- URL: http://arxiv.org/abs/2305.17825v1
- Date: Sun, 28 May 2023 23:33:41 GMT
- Title: Multinomial Logistic Regression: Asymptotic Normality on Null Covariates
in High-Dimensions
- Authors: Kai Tan and Pierre C. Bellec
- Abstract summary: This paper investigates the distribution of the maximum-likelihood estimate (MLE) in multinomial logistic models in the high-dimensional regime where dimension and sample size are of the same order.
- Score: 11.69389391551085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the asymptotic distribution of the maximum-likelihood
estimate (MLE) in multinomial logistic models in the high-dimensional regime
where dimension and sample size are of the same order. While classical
large-sample theory provides asymptotic normality of the MLE under certain
conditions, such classical results are expected to fail in high-dimensions as
documented for the binary logistic case in the seminal work of Sur and Candès
[2019]. We address this issue in classification problems with 3 or more
classes, by developing asymptotic normality and asymptotic chi-square results
for the multinomial logistic MLE (also known as cross-entropy minimizer) on
null covariates. Our theory leads to a new methodology to test the significance
of a given feature. Extensive simulation studies on synthetic data corroborate
these asymptotic results and confirm the validity of proposed p-values for
testing the significance of a given feature.
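For concreteness, the setting described in the abstract can be written out as follows. This is a sketch in standard textbook notation, which is an assumption here and may differ from the paper's own: $n$ observations $(x_i, y_i)$ with $x_i \in \mathbb{R}^p$, labels $y_i$ in one of $K \ge 3$ classes, a coefficient matrix $B = [b_1, \dots, b_K] \in \mathbb{R}^{p \times K}$, and a ratio $p/n$ that stays of constant order.

```latex
% Multinomial logistic (softmax) model for the class probabilities
\mathbb{P}(y_i = k \mid x_i)
  = \frac{\exp(x_i^\top b_k)}{\sum_{k'=1}^{K} \exp(x_i^\top b_{k'})},
  \qquad k = 1, \dots, K.

% The MLE is the minimizer of the cross-entropy loss
\hat{B} \in \operatorname*{arg\,min}_{B \in \mathbb{R}^{p \times K}}
  \; - \sum_{i=1}^{n} \sum_{k=1}^{K} \mathbf{1}\{y_i = k\}
  \log \frac{\exp(x_i^\top b_k)}{\sum_{k'=1}^{K} \exp(x_i^\top b_{k'})}.

% Feature j is a null covariate when its true coefficient is the same
% for every class, so that it does not change any class probability
H_0 : \; b^{*}_{1j} = b^{*}_{2j} = \dots = b^{*}_{Kj}.
```

The softmax probabilities are unchanged when the same vector is added to every $b_k$, so an identifiability constraint (for example, fixing one class as a reference) is needed in practice; the abstract does not say which convention the paper adopts.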
Related papers
- Finite-sample performance of the maximum likelihood estimator in logistic regression [3.7550827441501844]
We consider the predictive performance of the maximum likelihood estimator (MLE) for logistic regression.
We obtain sharp non-asymptotic guarantees for the existence and excess logistic risk of the MLE.
arXiv Detail & Related papers (2024-11-04T14:50:15Z)
- Statistical Inference in Classification of High-dimensional Gaussian Mixture [1.2354076490479515]
We investigate the behavior of a general class of regularized convex classifiers in the high-dimensional limit.
Our focus is on the generalization error and variable selection properties of the estimators.
arXiv Detail & Related papers (2024-10-25T19:58:36Z)
- Method-of-Moments Inference for GLMs and Doubly Robust Functionals under Proportional Asymptotics [30.324051162373973]
We consider the estimation of regression coefficients and signal-to-noise ratio in high-dimensional Generalized Linear Models (GLMs).
We derive Consistent and Asymptotically Normal (CAN) estimators of our targets of inference.
We complement our theoretical results with numerical experiments and comparisons with existing literature.
arXiv Detail & Related papers (2024-08-12T12:43:30Z)
- A non-asymptotic distributional theory of approximate message passing
for sparse and robust regression [20.830017611900832]
This paper develops non-asymptotic distributional characterizations for approximate message passing (AMP).
AMP is a family of iterative algorithms that prove effective as both fast estimators and powerful theoretical machinery.
arXiv Detail & Related papers (2024-01-08T14:34:35Z)
- Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048]
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning.
Theoretical aspects, e.g., identifiability and properties of statistical estimation, are still obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
arXiv Detail & Related papers (2022-10-12T06:46:38Z)
- Off-policy estimation of linear functionals: Non-asymptotic theory for
semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z)
- Double Descent in Random Feature Models: Precise Asymptotic Analysis for
General Convex Regularization [4.8900735721275055]
We provide precise expressions for the generalization of regression under a broad class of convex regularization terms.
We numerically demonstrate the predictive capacity of our framework, and show experimentally that the predicted test error is accurate even in the non-asymptotic regime.
arXiv Detail & Related papers (2022-04-06T08:59:38Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Learning Gaussian Mixtures with Generalised Linear Models: Precise
Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high-dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)