Last Layer Marginal Likelihood for Invariance Learning
- URL: http://arxiv.org/abs/2106.07512v1
- Date: Mon, 14 Jun 2021 15:40:51 GMT
- Title: Last Layer Marginal Likelihood for Invariance Learning
- Authors: Pola Elisabeth Schwöbel, Martin Jørgensen, Sebastian W. Ober, Mark van der Wilk
- Abstract summary: We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions.
We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer.
- Score: 12.00078928875924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation is often used to incorporate inductive biases into models.
Traditionally, these are hand-crafted and tuned with cross validation. The
Bayesian paradigm for model selection provides a path towards end-to-end
learning of invariances using only the training data, by optimising the
marginal likelihood. We work towards bringing this approach to neural networks
by using an architecture with a Gaussian process in the last layer, a model for
which the marginal likelihood can be computed. Experimentally, we improve
performance by learning appropriate invariances in standard benchmarks, in the
low-data regime, and in a medical imaging task. Optimisation challenges for
invariant Deep Kernel Gaussian processes are identified, and a systematic
analysis is presented to arrive at a robust training scheme. We introduce a new
lower bound to the marginal likelihood, which allows us to perform inference
for a larger class of likelihood functions than before, thereby overcoming some
of the training challenges that existed with previous approaches.
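The core mechanism is compact enough to sketch. Below is a minimal, self-contained illustration (not the authors' code) of learning an invariance through the GP marginal likelihood: a base kernel is made approximately invariant by Monte Carlo averaging over sampled augmentations, and the augmentation width is optimised alongside the other hyperparameters by gradient ascent on the log marginal likelihood. The additive-jitter augmentation and all names are illustrative; the paper applies this with a deep-kernel GP last layer on learned features rather than raw inputs.

```python
import math
import torch

torch.manual_seed(0)

def augment(x, aug_width, n_samples=8):
    # Illustrative augmentation: additive jitter whose width is learned.
    eps = torch.randn(n_samples, *x.shape)
    return x + aug_width * eps                       # (S, N, D)

def rbf(a, b, lengthscale):
    return torch.exp(-0.5 * torch.cdist(a, b) ** 2 / lengthscale ** 2)

def invariant_kernel(x, aug_width, lengthscale):
    # K_inv(x, x') = E[k(g(x), g'(x'))], Monte Carlo over augmentations g, g'.
    xs = augment(x, aug_width)
    K = sum(rbf(a, b, lengthscale) for a in xs for b in xs)
    return K / xs.shape[0] ** 2

def gp_log_marginal_likelihood(K, y, noise):
    n = y.shape[0]
    L = torch.linalg.cholesky(K + (noise ** 2 + 1e-6) * torch.eye(n))
    alpha = torch.cholesky_solve(y.unsqueeze(-1), L).squeeze(-1)
    return (-0.5 * y @ alpha
            - torch.log(torch.diagonal(L)).sum()
            - 0.5 * n * math.log(2 * math.pi))

# Toy data whose targets ignore small input perturbations.
x = torch.randn(32, 2)
y = torch.sin(3 * x[:, 0])

aug_width = torch.tensor(0.01, requires_grad=True)   # invariance parameter
lengthscale = torch.tensor(1.0, requires_grad=True)
noise = torch.tensor(0.1, requires_grad=True)

opt = torch.optim.Adam([aug_width, lengthscale, noise], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    K = invariant_kernel(x, aug_width, lengthscale)
    nll = -gp_log_marginal_likelihood(K, y, noise)
    nll.backward()
    opt.step()
```

Because the augmentation enters the kernel through reparameterised samples, the marginal likelihood is differentiable in `aug_width`, so the model can discover how much invariance the data supports.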
Related papers
- A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization [7.378582040635655]
Current deep learning approaches rely on generative models that yield exact sample likelihoods.
This work introduces a method that lifts this restriction and opens up the possibility of employing highly expressive latent variable models.
We experimentally validate our approach in data-free Combinatorial Optimization and demonstrate that our method achieves a new state-of-the-art on a wide range of benchmark problems.
arXiv Detail & Related papers (2024-06-03T17:55:02Z)
- Trustworthy Personalized Bayesian Federated Learning via Posterior Fine-Tune [3.1001287855313966]
We introduce a novel framework for personalized federated learning, incorporating Bayesian methodology.
We show that the new algorithm not only improves accuracy but also outperforms the baseline significantly in OOD detection.
arXiv Detail & Related papers (2024-02-25T13:28:08Z)
- Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift [12.770658031721435]
We propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution.
We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.
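As a generic illustration of lightweight last-layer adaptation in a spectral basis (the paper's exact procedure is not reproduced here), one can refit a frozen network's final linear layer with a truncated-SVD ridge solve on a few labelled target-domain examples; `feats`, `targets`, and the function name below are hypothetical stand-ins.

```python
import numpy as np

def spectral_last_layer_refit(feats, targets, rank=8, ridge=1e-2):
    # Solve min_W ||feats @ W - targets||^2 + ridge*||W||^2 restricted to the
    # top-`rank` singular subspace of the target-domain feature matrix.
    U, s, Vt = np.linalg.svd(feats, full_matrices=False)
    U, s, Vt = U[:, :rank], s[:rank], Vt[:rank]
    coef = (s / (s ** 2 + ridge))[:, None] * (U.T @ targets)
    return Vt.T @ coef                       # adapted last-layer weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(40, 32))            # penultimate features, target domain
targets = rng.normal(size=(40, 1))           # few labelled target examples
W_new = spectral_last_layer_refit(feats, targets)
```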
arXiv Detail & Related papers (2023-12-29T04:15:58Z)
- Amortised Inference in Bayesian Neural Networks [0.0]
We introduce the Amortised Pseudo-Observation Variational Inference Bayesian Neural Network (APOVI-BNN).
We show that amortised inference yields posteriors of similar or better quality than those obtained through traditional variational inference.
We then discuss how the APOVI-BNN may be viewed as a new member of the neural process family.
arXiv Detail & Related papers (2023-09-06T14:02:33Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
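A heavily simplified sketch of that objective (not the paper's Kronecker-factored implementation): use a diagonal empirical-Fisher proxy for the Hessian in the Laplace evidence and take joint gradient steps on the network weights and an augmentation parameter. The jitter `augment` and `aug_width` are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def augment(x, aug_width, n_aug=4):
    # Illustrative augmentation: n_aug jittered copies with a learned width.
    return torch.cat([x + aug_width * torch.randn_like(x) for _ in range(n_aug)])

def laplace_log_evidence(model, x, y, prior_prec=1.0):
    # Diagonal Laplace approximation:
    #   log p(D) ~ -loss_sum - 0.5*prior_prec*|theta|^2
    #              + 0.5 * sum_i log(prior_prec / (h_i + prior_prec)),
    # with h_i a crude diagonal empirical-Fisher curvature proxy.
    loss_sum = F.cross_entropy(model(x), y, reduction="sum")
    params = list(model.parameters())
    grads = torch.autograd.grad(loss_sum, params, create_graph=True)
    log_det, sq_norm = 0.0, 0.0
    for p, g in zip(params, grads):
        log_det = log_det + torch.log(prior_prec / (g ** 2 + prior_prec)).sum()
        sq_norm = sq_norm + (p ** 2).sum()
    return -loss_sum - 0.5 * prior_prec * sq_norm + 0.5 * log_det

model = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 2))
x = torch.randn(64, 2)
y = (x[:, 0] > 0).long()

aug_width = torch.tensor(0.05, requires_grad=True)
opt = torch.optim.Adam(list(model.parameters()) + [aug_width], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = -laplace_log_evidence(model, augment(x, aug_width), y.repeat(4))
    loss.backward()
    opt.step()
```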
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
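The full BAIT objective works with per-example Fisher matrices at the last layer; the sketch below conveys only the flavour of such Fisher-style batch selection, not the authors' algorithm: greedily pick points whose gradient embeddings most reduce tr(M_S^{-1} M_U), using rank-one Sherman-Morrison updates. The random `embeddings` stand in for real last-layer gradients.

```python
import torch

def greedy_fisher_select(embeddings, budget, ridge=1.0):
    # embeddings: (n, d) per-example gradient embeddings (stand-in).
    n, d = embeddings.shape
    M_U = embeddings.t() @ embeddings / n    # pooled "Fisher" over all data
    M_inv = torch.eye(d) / ridge             # inverse of the selected-set matrix
    selected = []
    for _ in range(budget):
        Mg = embeddings @ M_inv                        # rows: g_i^T M^{-1}
        denom = 1.0 + (Mg * embeddings).sum(dim=1)     # 1 + g^T M^{-1} g
        gains = ((Mg @ M_U) * Mg).sum(dim=1) / denom   # trace reduction per point
        if selected:
            gains[torch.tensor(selected)] = float("-inf")
        i = int(torch.argmax(gains))
        selected.append(i)
        u = M_inv @ embeddings[i]                      # Sherman-Morrison update
        M_inv = M_inv - torch.outer(u, u) / (1.0 + embeddings[i] @ u)
    return selected

picks = greedy_fisher_select(torch.randn(500, 32), budget=20)
```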
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
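The identity underlying this connection is exact: by the chain rule, log p(y_1:n) = sum_i log p(y_i | y_<i), so the log marginal likelihood is the sum of one-step-ahead predictive log-scores accumulated while the model learns from the data sequentially; a model that "trains fast" has high evidence. A sketch verifying the identity for Bayesian linear regression (illustrative; the paper's interest is in estimators for neural networks):

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
n, d, noise = 50, 3, 0.5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + noise * rng.normal(size=n)

# Prior w ~ N(0, I); update the posterior one observation at a time.
mu, Sigma = np.zeros(d), np.eye(d)
log_ml_sequential = 0.0
for x_i, y_i in zip(X, y):
    # One-step-ahead predictive: y_i | y_<i ~ N(x^T mu, x^T Sigma x + noise^2).
    pred_mean = x_i @ mu
    pred_var = x_i @ Sigma @ x_i + noise ** 2
    log_ml_sequential += norm.logpdf(y_i, pred_mean, np.sqrt(pred_var))
    # Conjugate rank-one posterior update.
    k = Sigma @ x_i / pred_var
    mu = mu + k * (y_i - pred_mean)
    Sigma = Sigma - np.outer(k, x_i @ Sigma)

# Direct evidence: marginally, y ~ N(0, X X^T + noise^2 I).
log_ml_direct = multivariate_normal.logpdf(y, np.zeros(n),
                                           X @ X.T + noise ** 2 * np.eye(n))
print(log_ml_sequential, log_ml_direct)   # agree up to floating-point error
```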
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of extrapolation variants can be covered by a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
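One member of the family such frameworks unify is the extra-gradient step: look ahead along the current gradient, then update the true iterate using the gradient evaluated at the lookahead point. A minimal sketch (illustrative, not the paper's exact scheme):

```python
import torch

def extragradient_step(w, loss_fn, lookahead_lr=0.1, lr=0.1):
    # Gradient at the current iterate.
    (g,) = torch.autograd.grad(loss_fn(w), w)
    # Lookahead point along that gradient.
    w_look = (w - lookahead_lr * g).detach().requires_grad_(True)
    # The actual update uses the gradient taken at the lookahead point.
    (g_look,) = torch.autograd.grad(loss_fn(w_look), w_look)
    return (w - lr * g_look).detach().requires_grad_(True)

w = torch.tensor([2.0, -3.0], requires_grad=True)
for _ in range(50):
    w = extragradient_step(w, lambda p: (p ** 2).sum())
```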
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.