Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks
- URL: http://arxiv.org/abs/2307.07753v1
- Date: Sat, 15 Jul 2023 09:24:33 GMT
- Title: Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks
- Authors: Dominik Schnaus, Jongseok Lee, Daniel Cremers, Rudolph Triebel
- Abstract summary: We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
- Score: 77.89179552509887
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we propose a novel prior learning method for advancing
generalization and uncertainty estimation in deep neural networks. The key idea
is to exploit scalable and structured posteriors of neural networks as
informative priors with generalization guarantees. Our learned priors provide
expressive probabilistic representations at large scale, like Bayesian
counterparts of pre-trained models on ImageNet, and further produce non-vacuous
generalization bounds. We also extend this idea to a continual learning
framework, where the favorable properties of our priors are desirable. Major
enablers are our technical contributions: (1) the sums-of-Kronecker-product
computations, and (2) the derivations and optimizations of tractable objectives
that lead to improved generalization bounds. Empirically, we exhaustively show
the effectiveness of this method for uncertainty estimation and generalization.
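The abstract names sums-of-Kronecker-product computations as contribution (1) without further detail here. As a hedged illustration only (the function name, shapes, and scope below are assumptions, not the authors' API), the sketch shows the standard identity that makes such structured matrix-vector products scalable: (A ⊗ B) vec(X) = vec(A X Bᵀ) for row-major vectorization, applied term by term to a sum of Kronecker products.

```python
import numpy as np

def kron_sum_matvec(factors, v):
    """Multiply v by sum_k (A_k kron B_k) without forming any Kronecker
    product explicitly.

    Relies on (A kron B) vec(X) = vec(A X B^T) for row-major vec, so each
    term costs O(mn(m+n)) instead of the O(m^2 n^2) of the dense product.
    factors: list of (A_k, B_k) pairs with A_k (m, m) and B_k (n, n);
    v: vector of length m * n.
    """
    out = np.zeros_like(v)
    for A, B in factors:
        X = v.reshape(A.shape[0], B.shape[0])   # undo the row-major vec
        out += (A @ X @ B.T).reshape(-1)        # re-vectorize the result
    return out

# Sanity check against the dense computation on a toy problem.
rng = np.random.default_rng(0)
m, n = 3, 4
factors = [(rng.normal(size=(m, m)), rng.normal(size=(n, n))) for _ in range(2)]
v = rng.normal(size=m * n)
dense = sum(np.kron(A, B) for A, B in factors)
assert np.allclose(dense @ v, kron_sum_matvec(factors, v))
```

Kronecker-factored approximations of neural-network posteriors reduce to exactly this kind of product, which is why the sums-of-Kronecker-product machinery matters for scalability.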
Related papers
- Learning via Surrogate PAC-Bayes [13.412960492870996]
PAC-Bayes learning is a comprehensive setting for studying the generalisation ability of learning algorithms.
We introduce a novel principled strategy for building an iterative learning algorithm via the optimisation of a sequence of surrogate training objectives.
On top of providing that generic recipe for learning via surrogate PAC-Bayes bounds, we contribute theoretical results establishing that iteratively optimising our surrogates implies the optimisation of the original generalisation bounds.
arXiv Detail & Related papers (2024-10-14T07:45:50Z)
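For context on the kind of bound such surrogate objectives target, a standard PAC-Bayes bound (Maurer's form of McAllester's bound) is sketched below; this is a generic textbook form, not necessarily the exact bound optimised in the paper above.

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for all posteriors Q, with the prior P fixed in advance:
\mathbb{E}_{h \sim Q}\!\left[L(h)\right]
  \;\le\;
\mathbb{E}_{h \sim Q}\!\left[\hat{L}(h)\right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

Learned, data-dependent priors P shrink the KL term, which is what makes informative priors attractive for obtaining non-vacuous bounds.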
- On the Generalization Ability of Unsupervised Pretraining [53.06175754026037]
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization.
This paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase.
Our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
arXiv Detail & Related papers (2024-03-11T16:23:42Z)
- Function-Space Regularization in Neural Networks: A Probabilistic Perspective [51.133793272222874]
We derive a well-motivated regularization technique that allows information about desired predictive functions to be explicitly encoded into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly calibrated predictive uncertainty estimates.
arXiv Detail & Related papers (2023-12-28T17:50:56Z)
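As a minimal sketch of the general idea behind function-space regularization (assuming a squared-error penalty between the network and a reference predictive function on context points; the names and the specific penalty are illustrative, not the paper's derivation):

```python
import numpy as np

def function_space_penalty(f_theta, f_prior, context_x, lam=1.0):
    """Penalize the distance between the network's predictions and a
    reference (prior) predictive function on a set of context points.

    f_theta, f_prior: callables mapping a batch of inputs to outputs
    (e.g., logits or means); context_x: array of inputs where the
    desired predictive behaviour is specified.
    """
    diff = f_theta(context_x) - f_prior(context_x)
    return lam * np.mean(np.sum(diff ** 2, axis=-1))

# Schematically, the training objective becomes:
#   loss(theta) = task_loss(theta) + function_space_penalty(f_theta, f_prior, ctx)
```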
- Sparsity-aware generalization theory for deep neural networks [12.525959293825318]
We present a new approach to analyzing generalization for deep feed-forward ReLU networks.
We show fundamental trade-offs between sparsity and generalization.
arXiv Detail & Related papers (2023-07-01T20:59:05Z)
- TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization [69.80141512683254]
We introduce Tabular Neural Gradient Orthogonalization and Specialization (TANGOS), a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
arXiv Detail & Related papers (2023-03-09T18:57:13Z)
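A hedged sketch of TANGOS-style penalties, assuming the latent-unit attributions (gradients of each latent unit with respect to the input features) have already been computed by autodiff; the weights and normalization details here are illustrative, not the paper's exact objective:

```python
import numpy as np

def tangos_style_penalties(attributions, lam_orth=1.0, lam_spec=1.0):
    """Sketch of TANGOS-style regularizers for a single example.

    attributions: (H, D) array whose row i is the gradient of latent
    unit i with respect to the D input features (from autodiff in
    practice; here it is simply an input).
    """
    A = attributions
    H = A.shape[0]
    # Orthogonalization: penalize pairwise cosine similarity between
    # the attribution vectors of different latent units.
    unit = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-12)
    cos = unit @ unit.T
    orth = np.abs(cos[~np.eye(H, dtype=bool)]).mean()
    # Specialization: encourage each unit to rely on few input features.
    spec = np.abs(A).mean()
    return lam_orth * orth + lam_spec * spec
```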
- Generalized Uncertainty of Deep Neural Networks: Taxonomy and Applications [1.9671123873378717]
We show that the uncertainty of deep neural networks is not only important in a sense of interpretability and transparency, but also crucial in further advancing their performance.
We generalize the definition of the uncertainty of deep neural networks to any number or vector that is associated with an input or an input-label pair, and catalog existing methods for "mining" such uncertainty from a deep model.
arXiv Detail & Related papers (2023-02-02T22:02:33Z)
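Under such a generalized definition, any scalar attached to an input counts as an uncertainty. Two of the most common instances (generic measures, not specific to the paper above) are sketched below.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def uncertainty_scores(logits):
    """Two classic per-input uncertainty numbers: predictive entropy
    and one minus the maximum class probability."""
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    one_minus_maxprob = 1.0 - p.max(axis=-1)
    return entropy, one_minus_maxprob
```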
- On the generalization of learning algorithms that do not converge [54.122745736433856]
Generalization analyses of deep learning typically assume that training converges to a fixed point.
Recent results indicate that in practice, the weights of deep neural networks optimized with gradient descent often oscillate indefinitely.
arXiv Detail & Related papers (2022-08-16T21:22:34Z)
- Variational Structured Attention Networks for Deep Visual Representation Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z)
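As a loose, purely illustrative sketch of jointly modulating a feature map with channel and spatial attention (the paper infers the attentions through probabilistic inference rules inside a CNN; the toy softmax pooling below is an assumption standing in for that machinery):

```python
import numpy as np

def joint_attention(features):
    """Toy stand-in: modulate a (C, H, W) feature map with a channel
    attention vector and a spatial attention map, applied jointly."""
    C, H, W = features.shape
    chan = features.reshape(C, -1).mean(axis=1)       # per-channel pooling
    chan = np.exp(chan - chan.max())
    chan_att = chan / chan.sum()                      # (C,) channel attention
    spat = features.mean(axis=0).reshape(-1)          # per-location pooling
    spat = np.exp(spat - spat.max())
    spat_att = (spat / spat.sum()).reshape(H, W)      # (H, W) spatial attention
    return features * chan_att[:, None, None] * spat_att[None, :, :]
```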
- Revisiting Explicit Regularization in Neural Networks for Well-Calibrated Predictive Uncertainty [6.09170287691728]
In this work, we revisit the importance of explicit regularization for obtaining well-calibrated predictive uncertainty.
We introduce a measure of calibration performance, which is lower bounded by the log-likelihood.
We then explore explicit regularization techniques for improving the log-likelihood on unseen samples, which provides well-calibrated predictive uncertainty.
arXiv Detail & Related papers (2020-06-11T13:14:01Z)
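The entry does not spell out the calibration measure it introduces. As a hedged reference point, the sketch below computes two standard quantities such work typically reports, test negative log-likelihood and expected calibration error (ECE); this is not the paper's specific measure.

```python
import numpy as np

def nll_and_ece(probs, labels, n_bins=10):
    """Test negative log-likelihood and expected calibration error.

    probs: (N, K) predicted class probabilities; labels: (N,) int
    class indices. ECE bins predictions by confidence and averages
    the gap between accuracy and confidence across the bins.
    """
    N = len(labels)
    nll = -np.mean(np.log(probs[np.arange(N), labels] + 1e-12))
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    ece, edges = 0.0, np.linspace(0.0, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - conf[in_bin].mean())
    return nll, ece
```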