A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized
Linear Models
- URL: http://arxiv.org/abs/2210.12082v1
- Date: Fri, 21 Oct 2022 16:16:55 GMT
- Authors: Lijia Zhou and Frederic Koehler and Pragya Sur and Danica J.
Sutherland and Nathan Srebro
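- Abstract summary: We prove a new generalization bound showing that, for any class of linear predictors in Gaussian space, the Rademacher complexity of the class and the training error under any continuous loss can control the test error under all Moreau envelopes of the loss.
We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021) for linear regression with the square loss.
We show that applying our generalization bound with localized Gaussian width will generally be sharp for empirical risk minimizers.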
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove a new generalization bound that shows for any class of linear
predictors in Gaussian space, the Rademacher complexity of the class and the
training error under any continuous loss $\ell$ can control the test error
under all Moreau envelopes of the loss $\ell$. We use our finite-sample bound
to directly recover the "optimistic rate" of Zhou et al. (2021) for linear
regression with the square loss, which is known to be tight for minimal
$\ell_2$-norm interpolation, but we also handle more general settings where the
label is generated by a potentially misspecified multi-index model. The same
argument can analyze noisy interpolation of max-margin classifiers through the
squared hinge loss, and establishes consistency results in spiked-covariance
settings. More generally, when the loss is only assumed to be Lipschitz, our
bound effectively improves Talagrand's well-known contraction lemma by a factor
of two, and we prove uniform convergence of interpolators (Koehler et al. 2021)
for all smooth, non-negative losses. Finally, we show that application of our
generalization bound using localized Gaussian width will generally be sharp for
empirical risk minimizers, establishing a non-asymptotic Moreau envelope theory
for generalization that applies outside of proportional scaling regimes,
handles model misspecification, and complements existing asymptotic Moreau
envelope theories for M-estimation.
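
To make the central object concrete, the following is a minimal sketch of a Moreau envelope of a loss, computed numerically. It assumes the convention $\ell_\lambda(\hat{y}) = \inf_u \, \ell(u) + \lambda (u - \hat{y})^2$; the paper's exact normalization may differ, and the names `moreau_envelope`, `square_loss`, and `abs_loss` are hypothetical, chosen only for illustration. Under this convention, the envelope of the square loss $(u - y)^2$ is $\frac{\lambda}{1+\lambda}(\hat{y} - y)^2$, and the envelope of the absolute loss is a Huber-type function.

```python
def moreau_envelope(loss, y_hat, lam, lo=-50.0, hi=50.0, iters=200):
    """Approximate inf_u loss(u) + lam * (u - y_hat)**2 by ternary search.

    Assumptions (not from the source): `loss` is convex, lam > 0, and the
    minimizer lies in [lo, hi], so the objective is unimodal on the bracket.
    """
    def objective(u):
        return loss(u) + lam * (u - y_hat) ** 2

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2  # the minimizer cannot lie to the right of m2
        else:
            lo = m1  # the minimizer cannot lie to the left of m1
    return objective(0.5 * (lo + hi))


def square_loss(y):
    """Return u -> (u - y)**2 for a fixed label y."""
    return lambda u: (u - y) ** 2


def abs_loss(u):
    """Absolute loss against a label of 0, a simple Lipschitz example."""
    return abs(u)


# Square loss: the envelope is a rescaled copy of the loss itself.
lam, y, y_hat = 2.0, 1.0, 3.0
numeric = moreau_envelope(square_loss(y), y_hat, lam)
closed_form = lam / (1.0 + lam) * (y_hat - y) ** 2

# Absolute loss: quadratic near zero, linear (shifted down) far from zero.
near = moreau_envelope(abs_loss, 0.1, lam)  # lam * 0.1**2 under this convention
far = moreau_envelope(abs_loss, 3.0, lam)   # 3.0 - 1 / (4 * lam) under this convention
```

Because taking $u = \hat{y}$ is always feasible in the infimum, the envelope never exceeds the original loss, which is why a bound on the test error under the envelope is a relaxed (smoothed) version of a bound under $\ell$ itself.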