Multinomial Logistic Regression: Asymptotic Normality on Null Covariates
in High-Dimensions
- URL: http://arxiv.org/abs/2305.17825v1
- Date: Sun, 28 May 2023 23:33:41 GMT
- Title: Multinomial Logistic Regression: Asymptotic Normality on Null Covariates
in High-Dimensions
- Authors: Kai Tan and Pierre C. Bellec
- Abstract summary: This paper investigates the distribution of the maximum-likelihood estimate (MLE) in multinomial logistic models in the high-dimensional regime where dimension and sample size are of the same order.
- Score: 11.69389391551085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the asymptotic distribution of the maximum-likelihood
estimate (MLE) in multinomial logistic models in the high-dimensional regime
where dimension and sample size are of the same order. While classical
large-sample theory provides asymptotic normality of the MLE under certain
conditions, such classical results are expected to fail in high-dimensions as
documented for the binary logistic case in the seminal work of Sur and Candès
[2019]. We address this issue in classification problems with 3 or more
classes, by developing asymptotic normality and asymptotic chi-square results
for the multinomial logistic MLE (also known as cross-entropy minimizer) on
null covariates. Our theory leads to a new methodology to test the significance
of a given feature. Extensive simulation studies on synthetic data corroborate
these asymptotic results and confirm the validity of proposed p-values for
testing the significance of a given feature.
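The setting in the abstract can be made concrete with a small simulation. The sketch below is an illustration only, not the paper's procedure: it draws synthetic data from a 3-class multinomial logistic model in which covariate 0 is null, then approximates the MLE by minimizing the cross-entropy with plain gradient descent. The sizes `n`, `p`, `K`, the signal scale, the iteration count, and all function names are assumptions chosen for illustration. The paper's asymptotic normality result and p-values are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 600, 60, 3  # hypothetical sizes, so p/n = 0.1

# True coefficients with the last class as reference.
# Row 0 is all zeros, so covariate 0 is a null covariate.
B_true = np.zeros((p, K - 1))
B_true[1:6] = 0.5 * rng.normal(size=(5, K - 1))

X = rng.normal(size=(n, p))


def probs(X, B):
    """Class probabilities under the multinomial logistic model."""
    logits = np.hstack([X @ B, np.zeros((X.shape[0], 1))])
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


# Sample labels from the model and one-hot encode them.
P_true = probs(X, B_true)
y = np.array([rng.choice(K, p=row) for row in P_true])
Y = np.eye(K)[y]


def loss(B):
    """Mean cross-entropy (negative log-likelihood divided by n)."""
    P = probs(X, B)
    return -np.mean(np.log(P[np.arange(n), y]))


def grad(B):
    """Gradient of the mean cross-entropy with respect to B."""
    return X.T @ (probs(X, B) - Y)[:, : K - 1] / n


# Conservative step size: the gradient's Lipschitz constant is at most
# half the largest eigenvalue of X^T X / n.
step = 1.0 / np.linalg.eigvalsh(X.T @ X / n).max()

B_hat = np.zeros((p, K - 1))
for _ in range(2000):
    B_hat -= step * grad(B_hat)

print("cross-entropy at B = 0:  ", loss(np.zeros((p, K - 1))))
print("cross-entropy at B_hat:  ", loss(B_hat))
print("estimate on null row 0:  ", B_hat[0])
```

Repeating this over many random seeds and collecting `B_hat[0]` gives an empirical distribution of the estimate on the null covariate, which is the quantity whose asymptotic behavior the abstract describes.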
Related papers
- A non-asymptotic distributional theory of approximate message passing
for sparse and robust regression [20.830017611900832]
This paper develops non-asymptotic distributional characterizations for approximate message passing (AMP).
AMP is a family of iterative algorithms that prove effective as both fast estimators and powerful theoretical machinery.
arXiv Detail & Related papers (2024-01-08T14:34:35Z)
- Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048]
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning.
Theoretical aspects, e.g., identifiability and properties of statistical estimation, are still obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
arXiv Detail & Related papers (2022-10-12T06:46:38Z)
- Off-policy estimation of linear functionals: Non-asymptotic theory for
semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z)
- Double Descent in Random Feature Models: Precise Asymptotic Analysis for
General Convex Regularization [4.8900735721275055]
We provide precise expressions for the generalization of regression under a broad class of convex regularization terms.
We numerically demonstrate the predictive capacity of our framework, and show experimentally that the predicted test error is accurate even in the non-asymptotic regime.
arXiv Detail & Related papers (2022-04-06T08:59:38Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Optimal regularizations for data generation with probabilistic graphical
models [0.0]
Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models.
We consider the particular case of $L_2$ and $L_1$ regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
arXiv Detail & Related papers (2021-12-02T14:45:16Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Learning Gaussian Mixtures with Generalised Linear Models: Precise
Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator obtained via empirical risk minimisation in high-dimensions.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z)
- Asymptotic Errors for Teacher-Student Convex Generalized Linear Models
(or : How to Prove Kabashima's Replica Formula) [23.15629681360836]
We prove an analytical formula for the reconstruction performance of convex generalized linear models.
We show that an analytical continuation may be carried out to extend the result to convex (but not strongly convex) problems.
We illustrate our claim with numerical examples on mainstream learning methods.
arXiv Detail & Related papers (2020-06-11T16:26:35Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.