The Promises and Pitfalls of Deep Kernel Learning
- URL: http://arxiv.org/abs/2102.12108v1
- Date: Wed, 24 Feb 2021 07:56:49 GMT
- Title: The Promises and Pitfalls of Deep Kernel Learning
- Authors: Sebastian W. Ober, Carl E. Rasmussen, Mark van der Wilk
- Abstract summary: We identify pathological behavior, including overfitting, on a simple toy example.
We explore this pathology, explaining its origins and considering how it applies to real datasets.
We find that a fully Bayesian treatment of deep kernel learning can rectify this overfitting and obtain the desired performance improvements.
- Score: 13.487684503022063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep kernel learning and related techniques promise to combine the
representational power of neural networks with the reliable uncertainty
estimates of Gaussian processes. One crucial aspect of these models is an
expectation that, because they are treated as Gaussian process models optimized
using the marginal likelihood, they are protected from overfitting. However, we
identify pathological behavior, including overfitting, on a simple toy example.
We explore this pathology, explaining its origins and considering how it
applies to real datasets. Through careful experimentation on UCI datasets,
CIFAR-10, and the UTKFace dataset, we find that the overfitting from
overparameterized deep kernel learning, in which the model is "somewhat
Bayesian", can in certain scenarios be worse than that from not being Bayesian
at all. However, we find that a fully Bayesian treatment of deep kernel
learning can rectify this overfitting and obtain the desired performance
improvements over standard neural networks and Gaussian processes.
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP)
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z) - Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning.
This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS)
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noises.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - Parallel and Limited Data Voice Conversion Using Stochastic Variational
Deep Kernel Learning [2.5782420501870296]
This paper proposes a voice conversion method that works with limited data.
It is based on variational deep kernel learning (SVDKL)
It is possible to estimate non-smooth and more complex functions.
arXiv Detail & Related papers (2023-09-08T16:32:47Z) - Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Guided Deep Kernel Learning [42.53025115287688]
We present a novel approach for learning deep kernels by utilizing infinite-width neural networks.
Our approach harnesses the reliable uncertainty estimation of the NNGPs to adapt the DKL target confidence when it encounters novel data points.
arXiv Detail & Related papers (2023-02-19T13:37:34Z) - MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by metalearning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z) - Can we achieve robustness from data alone? [0.7366405857677227]
Adversarial training and its variants have come to be the prevailing methods to achieve adversarially robust classification using neural networks.
We devise a meta-learning method for robust classification, that optimize the dataset prior to its deployment in a principled way.
Experiments on MNIST and CIFAR-10 demonstrate that the datasets we produce enjoy very high robustness against PGD attacks.
arXiv Detail & Related papers (2022-07-24T12:14:48Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - Evaluating State-of-the-Art Classification Models Against Bayes
Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z) - Physics Informed Deep Kernel Learning [24.033468062984458]
Physics Informed Deep Kernel Learning (PI-DKL) exploits physics knowledge represented by differential equations with latent sources.
For efficient and effective inference, we marginalize out the latent variables and derive a collapsed model evidence lower bound (ELBO)
arXiv Detail & Related papers (2020-06-08T22:43:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.