Asymptotic Errors for Teacher-Student Convex Generalized Linear Models
(or: How to Prove Kabashima's Replica Formula)
- URL: http://arxiv.org/abs/2006.06581v6
- Date: Thu, 10 Nov 2022 16:57:22 GMT
- Title: Asymptotic Errors for Teacher-Student Convex Generalized Linear Models
(or: How to Prove Kabashima's Replica Formula)
- Authors: Cedric Gerbelot, Alia Abbara and Florent Krzakala
- Abstract summary: We prove an analytical formula for the reconstruction performance of convex generalized linear models.
We show that an analytical continuation may be carried out to extend the result to convex (but not strongly convex) problems.
We illustrate our claim with numerical examples on mainstream learning methods.
- Score: 23.15629681360836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been a recent surge of interest in the study of asymptotic
reconstruction performance in various cases of generalized linear estimation
problems in the teacher-student setting, especially for the case of i.i.d.
standard normal matrices. Here, we go beyond these matrices, and prove an
analytical formula for the reconstruction performance of convex generalized
linear models with rotationally-invariant data matrices with arbitrary bounded
spectrum, rigorously confirming, under suitable assumptions, a conjecture
originally derived using the replica method from statistical physics. The proof
is achieved by leveraging message passing algorithms and the statistical
properties of their iterates, allowing us to characterize the asymptotic empirical
distribution of the estimator. For sufficiently strongly convex problems, we
show that the two-layer vector approximate message passing algorithm (2-MLVAMP)
converges, where the convergence analysis is done by checking the stability of
an equivalent dynamical system, which gives the result for such problems. We
then show that, under a concentration assumption, an analytical continuation
may be carried out to extend the result to convex (but not strongly convex) problems. We
illustrate our claim with numerical examples on mainstream learning methods
such as sparse logistic regression and linear support vector classifiers,
showing excellent agreement between moderate-size simulations and the asymptotic
prediction.
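The teacher-student setting described above can be illustrated with a short numerical sketch. The parameter choices below (dimensions, spectrum, regularization strength, step size) are hypothetical illustrations, and plain gradient descent stands in for the paper's 2-MLVAMP analysis: we build a rotationally-invariant design matrix with an arbitrary bounded spectrum, generate labels through a logistic teacher channel, fit a ridge-regularized (hence strongly convex) logistic student, and measure the reconstruction overlap with the teacher.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 200  # samples and dimension; the asymptotic theory fixes the ratio n/d

# Rotationally-invariant design: F = U diag(s) V^T with Haar-distributed
# orthogonal factors (obtained via QR of Gaussian matrices) and an arbitrary
# bounded spectrum s -- here uniform on [0.5, 1.5] as an illustrative choice.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((d, d)))
s = rng.uniform(0.5, 1.5, size=d)
F = (U[:, :d] * s) @ V.T

# Teacher weights and labels from a logistic channel
teacher = rng.standard_normal(d) / np.sqrt(d)
p = 1.0 / (1.0 + np.exp(-(F @ teacher)))
y = 2 * (rng.random(n) < p) - 1  # labels in {-1, +1}

# Student: ridge-regularized logistic regression (strongly convex, the regime
# where the convergence analysis applies), fit by plain gradient descent.
lam, lr = 0.1, 0.5
w = np.zeros(d)
for _ in range(2000):
    z = y * (F @ w)
    grad = -(F.T @ (y / (1.0 + np.exp(z)))) / n + lam * w
    w -= lr * grad

# Reconstruction performance: squared error and normalized overlap
mse = np.mean((w - teacher) ** 2)
overlap = (w @ teacher) / (np.linalg.norm(w) * np.linalg.norm(teacher))
print(f"MSE = {mse:.4f}, overlap = {overlap:.3f}")
```

Repeating this over several draws and comparing the empirical overlap against the replica prediction for the chosen spectral density is the kind of moderate-size experiment the abstract refers to.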
Related papers
- Statistical Inference in Classification of High-dimensional Gaussian Mixture [1.2354076490479515]
We investigate the behavior of a general class of regularized convex classifiers in the high-dimensional limit.
Our focus is on the generalization error and variable selection properties of the estimators.
arXiv Detail & Related papers (2024-10-25T19:58:36Z)
- Flat Minima in Linear Estimation and an Extended Gauss Markov Theorem [0.0]
We derive simple and explicit formulas for the optimal estimator in the cases of Nuclear and Spectral norms.
We analytically derive the generalization error in multiple random matrix ensembles, and compare with Ridge regression.
arXiv Detail & Related papers (2023-11-18T14:45:06Z)
- Precise Asymptotics for Spectral Methods in Mixed Generalized Linear Models [31.58736590532443]
We consider the problem of estimating two statistically independent signals in a mixed generalized linear model.
Our characterization exploits a mix of tools from random matrices, free probability and the theory of approximate message passing algorithms.
arXiv Detail & Related papers (2022-11-21T11:35:25Z)
- Adaptive Estimation of Graphical Models under Total Positivity [13.47131471222723]
We consider the problem of estimating (diagonally dominant) M-matrices as precision matrices in Gaussian graphical models.
We propose an adaptive multiple-stage estimation method that refines the estimate.
We develop a unified framework based on the gradient projection method to solve the regularized problem.
arXiv Detail & Related papers (2022-10-27T14:21:27Z)
- Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z)
- Robust Regularized Low-Rank Matrix Models for Regression and Classification [14.698622796774634]
We propose a framework for matrix variate regression models based on a rank constraint, vector regularization (e.g., sparsity), and a general loss function.
We show that the algorithm is guaranteed to converge, with all accumulation points of the algorithm having estimation errors of order $O(1/\sqrt{n})$ asymptotically, substantially attaining the minimax rate.
arXiv Detail & Related papers (2022-05-14T18:03:48Z)
- Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
- Hessian Eigenspectra of More Realistic Nonlinear Models [73.31363313577941]
We give a precise characterization of the Hessian eigenspectra for a broad family of nonlinear models.
Our analysis takes a step forward to identify the origin of many striking features observed in more complex machine learning models.
arXiv Detail & Related papers (2021-03-02T06:59:52Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees [106.91654068632882]
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.