A PAC-Bayesian Link Between Generalisation and Flat Minima
- URL: http://arxiv.org/abs/2402.08508v1
- Date: Tue, 13 Feb 2024 15:03:02 GMT
- Title: A PAC-Bayesian Link Between Generalisation and Flat Minima
- Authors: Maxime Haddouche, Paul Viallard, Umut Simsekli, Benjamin Guedj
- Abstract summary: Modern machine learning usually involves predictors in the overparametrised setting.
Their good generalisation despite overparametrisation challenges many theoretical results and remains an open problem.
We provide novel generalisation bounds involving gradient terms.
- Score: 36.00919528944891
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Modern machine learning usually involves predictors in the overparametrised
setting (number of trained parameters greater than dataset size), and their
training yields not only good performance on training data, but also good
generalisation capacity. This phenomenon challenges many theoretical results
and remains an open problem. To reach a better understanding, we provide novel
generalisation bounds involving gradient terms. To do so, we combine the
PAC-Bayes toolbox with Poincaré and Log-Sobolev inequalities, avoiding an
explicit dependency on the dimension of the predictor space. Our results highlight
the positive influence of *flat minima* (minima whose neighbourhood nearly
minimises the learning problem as well) on generalisation performance,
directly capturing the benefits of the optimisation phase.
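To make the mechanism concrete, here is a minimal sketch of the ingredient the abstract names. The first display is the standard Poincaré inequality; the second is only an illustrative shape for a gradient-based PAC-Bayes bound, not the paper's exact theorem (here $Q$ is a posterior over predictors, $P$ a prior, $R$ and $\hat{R}_m$ the true and empirical risks on $m$ samples, and $C_P$ the Poincaré constant; the constants and precise form are assumptions).

```latex
% Poincaré inequality for a distribution Q with constant C_P: the variance of
% any smooth function f is controlled by its expected squared gradient.
\mathrm{Var}_{\theta \sim Q}[f(\theta)]
  \;\le\; C_P \,\mathbb{E}_{\theta \sim Q}\!\left[\|\nabla f(\theta)\|^2\right]

% Illustrative shape only (not the paper's exact statement): applying such an
% inequality to the risk trades explicit dimension dependence for gradient
% norms, yielding bounds of the flavour
\mathbb{E}_{\theta \sim Q}\!\left[R(\theta) - \hat{R}_m(\theta)\right]
  \;\lesssim\;
  \sqrt{\frac{C_P\,\mathbb{E}_{\theta \sim Q}\!\left[\|\nabla_\theta \hat{R}_m(\theta)\|^2\right]
        + \mathrm{KL}(Q\,\|\,P)}{m}}
```

Near a flat minimum the gradient norms in such a bound are small over a whole neighbourhood, which is how flatness enters the generalisation guarantee.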
Related papers
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme (a standard form of the underlying PAC-Bayes bound is recalled after this list).
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Fine-grained analysis of non-parametric estimation for pairwise learning [9.676007573960383]
We are concerned with the generalization performance of non-parametric estimation for pairwise learning.
Our results can be used to handle a wide range of pairwise learning problems, including ranking, AUC maximization, pairwise regression, and metric and similarity learning (the generic pairwise risk is written out after this list).
arXiv Detail & Related papers (2023-05-31T08:13:14Z)
- Improving Generalization of Complex Models under Unbounded Loss Using PAC-Bayes Bounds [10.94126149188336]
PAC-Bayes learning theory has focused extensively on establishing tight upper bounds for test errors.
A recently proposed training procedure, called PAC-Bayes training, updates the model toward minimizing these bounds.
While this approach is theoretically sound, in practice it has not achieved test errors as low as those obtained by empirical risk minimization (ERM).
We introduce a new PAC-Bayes training algorithm with improved performance and reduced reliance on prior tuning (a minimal training-loop sketch appears after this list).
arXiv Detail & Related papers (2023-05-30T17:31:25Z)
- Federated Compositional Deep AUC Maximization [58.25078060952361]
We develop a novel federated learning method for imbalanced data by directly optimizing the area under the ROC curve (AUC) score (a pairwise AUC surrogate is sketched after this list).
To the best of our knowledge, this is the first work to achieve such favorable theoretical results.
arXiv Detail & Related papers (2023-04-20T05:49:41Z)
- Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al.
We provide a theoretical analysis and an empirical case study of the conditions under which, and the extent to which, these meta-learning guarantees improve upon PAC-Bayesian per-task learning bounds.
arXiv Detail & Related papers (2022-11-14T08:51:04Z)
- PAC-Bayes unleashed: generalisation bounds with unbounded losses [12.078257783674923]
We present new PAC-Bayesian generalisation bounds for learning problems with unbounded loss functions.
This extends the relevance and applicability of the PAC-Bayes learning framework.
arXiv Detail & Related papers (2020-06-12T15:55:46Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that, compared to data augmentation, feature averaging reduces generalization error when used with convex losses and tightens PAC-Bayes bounds (the one-line convexity argument is recalled after this list).
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
- PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees [77.67258935234403]
We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning.
We develop a class of PAC-optimal meta-learning algorithms with performance guarantees and a principled meta-level regularization.
arXiv Detail & Related papers (2020-02-13T15:01:38Z)
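For orientation, several entries above (the interpolating-regime, PAC-Bayes training, and unbounded-loss papers in particular) build on the classical PAC-Bayes bound. A standard form, due to McAllester (with Maurer's constants), for a loss taking values in $[0,1]$:

```latex
% With probability at least 1 - delta over an i.i.d. sample of size m,
% simultaneously for all posteriors Q (P is a data-independent prior):
\mathbb{E}_{\theta \sim Q}[R(\theta)]
  \;\le\; \mathbb{E}_{\theta \sim Q}[\hat{R}_m(\theta)]
  + \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{m}}{\delta}}{2m}}
```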
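The pairwise-learning entry studies risks defined on pairs of examples rather than single ones; ranking, AUC maximization, and metric learning are all instances of the generic form below, whose empirical version is a U-statistic over all ordered pairs:

```latex
R(f) \;=\; \mathbb{E}_{z, z' \sim \mathcal{D}}\!\left[\ell(f; z, z')\right],
\qquad
\hat{R}_m(f) \;=\; \frac{1}{m(m-1)} \sum_{i \neq j} \ell(f; z_i, z_j)
```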
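The PAC-Bayes training entry minimizes a bound rather than the empirical risk alone. Below is a minimal sketch of that idea, assuming a Gaussian posterior over the weights of a toy linear model and a McAllester-style complexity term used as a regularizer; it is not the cited paper's algorithm, and all names and hyperparameters (`mu`, `sigma`, `sigma0`, the synthetic data) are hypothetical.

```python
# Minimal PAC-Bayes training sketch: learn the mean of a Gaussian posterior
# Q = N(mu, sigma^2 I) by minimising a one-sample estimate of E_Q[empirical
# loss] plus a McAllester-style KL complexity term. Illustrative only.
import math
import torch

torch.manual_seed(0)
m, d = 200, 10                             # sample size, input dimension
X = torch.randn(m, d)                      # synthetic inputs
y = (X[:, 0] > 0).float()                  # toy binary labels

mu = torch.zeros(d, requires_grad=True)    # posterior mean (learned)
sigma, sigma0, delta = 0.1, 1.0, 0.05      # posterior std, prior std, confidence

opt = torch.optim.Adam([mu], lr=1e-2)
for step in range(500):
    opt.zero_grad()
    w = mu + sigma * torch.randn(d)        # reparameterised draw from Q
    emp_loss = torch.nn.functional.binary_cross_entropy_with_logits(X @ w, y)
    # Closed-form KL( N(mu, sigma^2 I) || N(0, sigma0^2 I) ) for Gaussians
    kl = 0.5 * (d * sigma**2 / sigma0**2 + mu.pow(2).sum() / sigma0**2
                - d + d * math.log(sigma0**2 / sigma**2))
    # McAllester-style complexity term (the cross-entropy loss is unbounded,
    # so this acts as a regulariser here rather than a valid certificate)
    bound = emp_loss + torch.sqrt((kl + math.log(2 * math.sqrt(m) / delta)) / (2 * m))
    bound.backward()
    opt.step()
```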
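The federated AUC entry optimizes the area under the ROC curve, which is inherently pairwise: it is the probability that a positive example is scored above a negative one. A minimal differentiable surrogate for a single batch might look as follows (a sketch only; the cited paper's federated, compositional algorithm is considerably more involved, and `auc_surrogate_loss` is a hypothetical helper):

```python
import torch

def auc_surrogate_loss(scores: torch.Tensor, labels: torch.Tensor,
                       margin: float = 1.0) -> torch.Tensor:
    """Squared-hinge surrogate for 1 - AUC over all pos/neg pairs in a batch."""
    pos = scores[labels == 1]                    # scores of positive examples
    neg = scores[labels == 0]                    # scores of negative examples
    diff = pos.unsqueeze(1) - neg.unsqueeze(0)   # s(x+) - s(x-), (n_pos, n_neg)
    # Penalise every pair ranked with less than the required margin
    return torch.clamp(margin - diff, min=0).pow(2).mean()
```

Maximizing AUC then amounts to minimizing this surrogate with any stochastic optimizer; the batch must contain at least one positive and one negative example for the mean to be defined.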
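Finally, the feature-averaging claim in the invariance entry rests on a one-line convexity argument: averaging predictions over a group of transformations before applying a loss that is convex in its first argument can only decrease it, by Jensen's inequality.

```latex
% For a group G of transformations, predictor f, and loss ell convex in its
% first argument, Jensen's inequality gives for every example (x, y):
\ell\!\left(\mathbb{E}_{g \sim G}\big[f(g x)\big],\; y\right)
  \;\le\; \mathbb{E}_{g \sim G}\!\left[\ell\big(f(g x),\, y\big)\right]
```

Taking expectations over the data shows the averaged predictor's risk is no larger than the augmented risk, which is the mechanism behind the tightened bounds.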
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.