Rényi Neural Processes
- URL: http://arxiv.org/abs/2405.15991v1
- Date: Sat, 25 May 2024 00:14:55 GMT
- Title: Rényi Neural Processes
- Authors: Xuesong Wang, He Zhao, Edwin V. Bonilla
- Abstract summary: We propose Rényi Neural Processes (RNP) to relax the influence of the misspecified prior.
By replacing the standard KL divergence with the Rényi divergence between the posterior and the approximated prior, we ameliorate the impact of the misspecified prior.
Our experiments showed log-likelihood improvements on several existing NP families.
- Score: 14.11793373584558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Processes (NPs) are variational frameworks that aim to represent stochastic processes with deep neural networks. Despite their obvious benefits in uncertainty estimation for complex distributions via data-driven priors, NPs enforce network parameter sharing between the conditional prior and posterior distributions, thereby risking the introduction of a misspecified prior. We hereby propose Rényi Neural Processes (RNP) to relax the influence of the misspecified prior and optimize a tighter bound of the marginal likelihood. More specifically, by replacing the standard KL divergence with the Rényi divergence between the posterior and the approximated prior, we ameliorate the impact of the misspecified prior via a parameter α so that the resulting posterior focuses more on tail samples and reduces density in overconfident regions. Our experiments show log-likelihood improvements on several existing NP families. We demonstrate the superior performance of our approach on various benchmarks, including regression and image inpainting tasks. We also validate the effectiveness of RNPs on real-world tabular regression problems.
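The abstract describes the key change, swapping the objective's KL term for a Rényi divergence of order α between the approximate posterior and the conditional prior, but does not reproduce the formula. The sketch below is a minimal, hypothetical illustration assuming diagonal Gaussian latent distributions, as is common in latent-variable NPs; the closed-form Rényi divergence between Gaussians is standard, while the function name, tensor shapes, and default α are illustrative and not taken from the paper.

```python
import torch

def renyi_divergence_diag_gauss(mu_q, logvar_q, mu_p, logvar_p, alpha=0.7):
    """Closed-form Renyi divergence D_alpha(q || p) for diagonal Gaussians.

    Hypothetical sketch: q plays the role of the NP posterior q(z | context, target)
    and p the conditional prior p(z | context); alpha in (0, 1) tempers the prior's
    influence, and the term approaches the usual KL divergence as alpha -> 1.
    """
    var_q, var_p = logvar_q.exp(), logvar_p.exp()
    # Interpolated variance; positive for alpha in (0, 1), so the term is well defined.
    var_a = alpha * var_p + (1.0 - alpha) * var_q
    quad = alpha * (mu_q - mu_p).pow(2) / (2.0 * var_a)
    log_term = (var_a.log() - (1.0 - alpha) * var_q.log()
                - alpha * var_p.log()) / (2.0 * (1.0 - alpha))
    return (quad + log_term).sum(dim=-1)  # sum over the latent dimension
```

In a latent NP objective this quantity would stand in for the usual KL(q(z | context, target) || p(z | context)) penalty; as α approaches 1 the standard bound is recovered, which is consistent with the abstract's claim that α controls how strongly the possibly misspecified prior constrains the posterior.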
Related papers
- Reparameterization invariance in approximate Bayesian inference [32.88960624085645]
We develop a new geometric view of reparametrizations from which we explain the success of linearization.
We demonstrate that these reparameterization invariance properties can be extended to the original neural network predictive.
arXiv Detail & Related papers (2024-06-05T14:49:15Z) - Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z) - Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters [69.24377241408851]
Overfitting to the source domain is a common issue in gradient-based training of deep neural networks.
We propose to base the selection on the gradient signal-to-noise ratio (GSNR) of the network's parameters; a minimal GSNR sketch is given after this list.
arXiv Detail & Related papers (2023-10-11T10:21:34Z) - Neural Diffusion Processes [12.744250155946503]
We propose Neural Diffusion Processes (NDPs), a novel approach that learns to sample from a rich distribution over functions through its finite marginals.
We empirically show that NDPs can capture functional distributions close to the true Bayesian posterior.
NDPs enable a variety of downstream tasks, including regression, implicit hyperparameter marginalisation, non-Gaussian posterior prediction and global optimisation.
arXiv Detail & Related papers (2022-06-08T16:13:04Z) - How do noise tails impact on deep ReLU networks? [2.5889847253961418]
We show how the optimal rate of convergence depends on p, the degree of smoothness and the intrinsic dimension in a class of nonparametric regression functions.
We also contribute some new results on the approximation theory of deep ReLU neural networks.
arXiv Detail & Related papers (2022-03-20T00:27:32Z) - Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes [55.62520135103578]
We show that the gradient estimates used in training Deep Gaussian Processes (DGPs) with importance-weighted variational inference are susceptible to signal-to-noise ratio (SNR) issues.
We show that our fix can lead to consistent improvements in the predictive performance of DGP models.
arXiv Detail & Related papers (2020-11-01T14:38:02Z) - NP-PROV: Neural Processes with Position-Relevant-Only Variances [113.20013269514327]
We present a new member of the Neural Process family named Neural Processes with Position-Relevant-Only Variances (NP-PROV).
NP-PROV hypothesizes that a target point close to a context point has small uncertainty, regardless of the function value at that position.
Our evaluation on synthetic and real-world datasets reveals that NP-PROV can achieve state-of-the-art likelihood while retaining a bounded variance.
arXiv Detail & Related papers (2020-06-15T06:11:21Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z) - Continual Learning with Extended Kronecker-factored Approximate Curvature [33.44290346786496]
We propose a quadratic penalty method for continual learning of neural networks that contain batch normalization layers.
Kronecker-factored approximate curvature (K-FAC) is widely used to compute the Hessian of a neural network in practice.
We extend the K-FAC method so that the inter-example relations are taken into account and the Hessian of deep neural networks can be properly approximated.
arXiv Detail & Related papers (2020-04-16T07:58:47Z)
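For the GSNR-based entry above ("Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters"), the following is a minimal sketch assuming the common definition of GSNR as the squared mean gradient divided by its variance across mini-batches; the helper name, tensor layout, and epsilon are illustrative and not taken from that paper.

```python
import torch

def parameter_gsnr(per_batch_grads: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Gradient signal-to-noise ratio per parameter.

    per_batch_grads: (num_batches, num_params) gradients of the loss w.r.t. each
    parameter, collected over several mini-batches. GSNR is the squared mean
    gradient (signal) divided by its variance across batches (noise).
    """
    mean = per_batch_grads.mean(dim=0)
    var = per_batch_grads.var(dim=0, unbiased=False)
    return mean.pow(2) / (var + eps)
```

Parameters with high GSNR have gradient directions that agree across mini-batches, which is presumably the kind of signal a GSNR-based selection criterion would favour.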
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.