Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks
- URL: http://arxiv.org/abs/2307.07753v1
- Date: Sat, 15 Jul 2023 09:24:33 GMT
- Title: Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks
- Authors: Dominik Schnaus, Jongseok Lee, Daniel Cremers, Rudolph Triebel
- Abstract summary: We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
- Score: 77.89179552509887
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we propose a novel prior learning method for advancing
generalization and uncertainty estimation in deep neural networks. The key idea
is to exploit scalable and structured posteriors of neural networks as
informative priors with generalization guarantees. Our learned priors provide
expressive probabilistic representations at large scale, like Bayesian
counterparts of pre-trained models on ImageNet, and further produce non-vacuous
generalization bounds. We also extend this idea to a continual learning
framework, where the favorable properties of our priors are desirable. Major
enablers are our technical contributions: (1) the sums-of-Kronecker-product
computations, and (2) the derivations and optimizations of tractable objectives
that lead to improved generalization bounds. Empirically, we exhaustively show
the effectiveness of this method for uncertainty estimation and generalization.
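The abstract names sums-of-Kronecker-product computations as contribution (1) without further detail here. As a hedged illustration only (the function name, shapes, and scope below are assumptions, not the authors' API), the sketch shows the standard identity that makes such structured matrix-vector products scalable: (A ⊗ B) vec(X) = vec(A X Bᵀ) for row-major vectorization, applied term by term to a sum of Kronecker products.

```python
import numpy as np

def kron_sum_matvec(factors, v):
    """Multiply v by sum_k (A_k kron B_k) without forming any Kronecker
    product explicitly.

    Relies on (A kron B) vec(X) = vec(A X B^T) for row-major vec, so each
    term costs O(mn(m+n)) instead of the O(m^2 n^2) of the dense product.
    factors: list of (A_k, B_k) pairs with A_k (m, m) and B_k (n, n);
    v: vector of length m * n.
    """
    out = np.zeros_like(v)
    for A, B in factors:
        X = v.reshape(A.shape[0], B.shape[0])   # undo the row-major vec
        out += (A @ X @ B.T).reshape(-1)        # re-vectorize the result
    return out

# Sanity check against the dense computation on a toy problem.
rng = np.random.default_rng(0)
m, n = 3, 4
factors = [(rng.normal(size=(m, m)), rng.normal(size=(n, n))) for _ in range(2)]
v = rng.normal(size=m * n)
dense = sum(np.kron(A, B) for A, B in factors)
assert np.allclose(dense @ v, kron_sum_matvec(factors, v))
```

Kronecker-factored approximations of neural-network posteriors reduce to exactly this kind of product, which is why the sums-of-Kronecker-product machinery matters for scalability.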
Related papers
- Learning via Surrogate PAC-Bayes [13.412960492870996]
PAC-Bayes learning is a comprehensive setting for studying the generalisation ability of learning algorithms.
We introduce a novel principled strategy for building an iterative learning algorithm via the optimisation of a sequence of surrogate training objectives.
On top of providing that generic recipe for learning via surrogate PAC-Bayes bounds, we contribute theoretical results establishing that iteratively optimising our surrogates implies the optimisation of the original generalisation bounds.
arXiv Detail & Related papers (2024-10-14T07:45:50Z)
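For context on the kind of bound such surrogate objectives target, a standard PAC-Bayes bound (Maurer's form of McAllester's bound) is sketched below; this is a generic textbook form, not necessarily the exact bound optimised in the paper above.

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for all posteriors Q, with the prior P fixed in advance:
\mathbb{E}_{h \sim Q}\!\left[L(h)\right]
  \;\le\;
\mathbb{E}_{h \sim Q}\!\left[\hat{L}(h)\right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

Learned, data-dependent priors P shrink the KL term, which is what makes informative priors attractive for obtaining non-vacuous bounds.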
- On the Generalization Ability of Unsupervised Pretraining [53.06175754026037]
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization.
This paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase.
Our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
arXiv Detail & Related papers (2024-03-11T16:23:42Z)
- Function-Space Regularization in Neural Networks: A Probabilistic Perspective [51.133793272222874]
We derive a well-motivated regularization technique that allows information about desired predictive functions to be explicitly encoded into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly calibrated predictive uncertainty estimates.
arXiv Detail & Related papers (2023-12-28T17:50:56Z)
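As a minimal sketch of the general idea behind function-space regularization (assuming a squared-error penalty between the network and a reference predictive function on context points; the names and the specific penalty are illustrative, not the paper's derivation):

```python
import numpy as np

def function_space_penalty(f_theta, f_prior, context_x, lam=1.0):
    """Penalize the distance between the network's predictions and a
    reference (prior) predictive function on a set of context points.

    f_theta, f_prior: callables mapping a batch of inputs to outputs
    (e.g., logits or means); context_x: array of inputs where the
    desired predictive behaviour is specified.
    """
    diff = f_theta(context_x) - f_prior(context_x)
    return lam * np.mean(np.sum(diff ** 2, axis=-1))

# Schematically, the training objective becomes:
#   loss(theta) = task_loss(theta) + function_space_penalty(f_theta, f_prior, ctx)
```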
- Sparsity-aware generalization theory for deep neural networks [12.525959293825318]
We present a new approach to analyzing generalization for deep feed-forward ReLU networks.
We show fundamental trade-offs between sparsity and generalization.
arXiv Detail & Related papers (2023-07-01T20:59:05Z)
- TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization [69.80141512683254]
We introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS), a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
arXiv Detail & Related papers (2023-03-09T18:57:13Z)
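A hedged sketch of TANGOS-style penalties, assuming the latent-unit attributions (gradients of each latent unit with respect to the input features) have already been computed by autodiff; the weights and normalization details here are illustrative, not the paper's exact objective:

```python
import numpy as np

def tangos_style_penalties(attributions, lam_orth=1.0, lam_spec=1.0):
    """Sketch of TANGOS-style regularizers for a single example.

    attributions: (H, D) array whose row i is the gradient of latent
    unit i with respect to the D input features (from autodiff in
    practice; here it is simply an input).
    """
    A = attributions
    H = A.shape[0]
    # Orthogonalization: penalize pairwise cosine similarity between
    # the attribution vectors of different latent units.
    unit = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-12)
    cos = unit @ unit.T
    orth = np.abs(cos[~np.eye(H, dtype=bool)]).mean()
    # Specialization: encourage each unit to rely on few input features.
    spec = np.abs(A).mean()
    return lam_orth * orth + lam_spec * spec
```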
- Generalized Uncertainty of Deep Neural Networks: Taxonomy and Applications [1.9671123873378717]
We show that the uncertainty of deep neural networks is not only important in a sense of interpretability and transparency, but also crucial in further advancing their performance.
We generalize the definition of the uncertainty of deep neural networks to any number or vector that is associated with an input or an input-label pair, and catalog existing methods for "mining" such uncertainty from a deep model.
arXiv Detail & Related papers (2023-02-02T22:02:33Z)
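Under such a generalized definition, any scalar attached to an input counts as an uncertainty. Two of the most common instances (generic measures, not specific to the paper above) are sketched below.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def uncertainty_scores(logits):
    """Two classic per-input uncertainty numbers: predictive entropy
    and one minus the maximum class probability."""
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    one_minus_maxprob = 1.0 - p.max(axis=-1)
    return entropy, one_minus_maxprob
```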
- On the generalization of learning algorithms that do not converge [54.122745736433856]
Generalization analyses of deep learning typically assume that training converges to a fixed point.
Recent results indicate that in practice, the weights of deep neural networks optimized with gradient descent often oscillate indefinitely.
arXiv Detail & Related papers (2022-08-16T21:22:34Z)
- Variational Structured Attention Networks for Deep Visual Representation Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z)
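As a loose, purely illustrative sketch of jointly modulating a feature map with channel and spatial attention (the paper infers the attentions through probabilistic inference rules inside a CNN; the toy softmax pooling below is an assumption standing in for that machinery):

```python
import numpy as np

def joint_attention(features):
    """Toy stand-in: modulate a (C, H, W) feature map with a channel
    attention vector and a spatial attention map, applied jointly."""
    C, H, W = features.shape
    chan = features.reshape(C, -1).mean(axis=1)       # per-channel pooling
    chan = np.exp(chan - chan.max())
    chan_att = chan / chan.sum()                      # (C,) channel attention
    spat = features.mean(axis=0).reshape(-1)          # per-location pooling
    spat = np.exp(spat - spat.max())
    spat_att = (spat / spat.sum()).reshape(H, W)      # (H, W) spatial attention
    return features * chan_att[:, None, None] * spat_att[None, :, :]
```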
- Revisiting Explicit Regularization in Neural Networks for Well-Calibrated Predictive Uncertainty [6.09170287691728]
In this work, we revisit the importance of explicit regularization for obtaining well-calibrated predictive uncertainty.
We introduce a measure of calibration performance, which is lower bounded by the log-likelihood.
We then explore explicit regularization techniques for improving the log-likelihood on unseen samples, which provides well-calibrated predictive uncertainty.
arXiv Detail & Related papers (2020-06-11T13:14:01Z)
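The entry does not spell out the calibration measure it introduces. As a hedged reference point, the sketch below computes two standard quantities such work typically reports, test negative log-likelihood and expected calibration error (ECE); this is not the paper's specific measure.

```python
import numpy as np

def nll_and_ece(probs, labels, n_bins=10):
    """Test negative log-likelihood and expected calibration error.

    probs: (N, K) predicted class probabilities; labels: (N,) int
    class indices. ECE bins predictions by confidence and averages
    the gap between accuracy and confidence across the bins.
    """
    N = len(labels)
    nll = -np.mean(np.log(probs[np.arange(N), labels] + 1e-12))
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    ece, edges = 0.0, np.linspace(0.0, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - conf[in_bin].mean())
    return nll, ece
```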