On Stein Variational Neural Network Ensembles
- URL: http://arxiv.org/abs/2106.10760v2
- Date: Tue, 22 Jun 2021 07:53:17 GMT
- Title: On Stein Variational Neural Network Ensembles
- Authors: Francesco D'Angelo, Vincent Fortuin, Florian Wenzel
- Abstract summary: In this work, we study different Stein variational gradient descent (SVGD) methods operating in the weight space, function space, and in a hybrid setting.
We find that SVGD using functional and hybrid kernels can overcome the limitations of deep ensembles.
- Score: 8.178886940116035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensembles of deep neural networks have achieved great success recently, but
they do not offer a proper Bayesian justification. Moreover, while they allow
for averaging of predictions over several hypotheses, they do not provide any
guarantees for their diversity, leading to redundant solutions in function
space. In contrast, particle-based inference methods, such as Stein variational
gradient descent (SVGD), offer a Bayesian framework, but rely on the choice of
a kernel to measure the similarity between ensemble members. In this work, we
study different SVGD methods operating in the weight space, function space, and
in a hybrid setting. We compare the SVGD approaches to other ensembling-based
methods in terms of their theoretical properties and assess their empirical
performance on synthetic and real-world tasks. We find that SVGD using
functional and hybrid kernels can overcome the limitations of deep ensembles.
It improves on functional diversity and uncertainty estimation and approaches
the true Bayesian posterior more closely. Moreover, we show that using
stochastic SVGD updates, as opposed to the standard deterministic ones, can
further improve the performance.
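To make the setup concrete, here is a minimal sketch (not the authors' released code) of a single SVGD step over an ensemble of n particles with an RBF kernel and the common median-heuristic bandwidth: each particle is pulled toward high posterior density by the kernel-weighted score term and pushed away from the other particles by the kernel-gradient term, which is what enforces diversity among ensemble members. The weight-space variant below treats each particle as a flattened weight vector; the functional and hybrid variants studied in the paper would instead evaluate the kernel on network predictions (or on a combination of weights and predictions). The `grad_log_p` callback, the bandwidth heuristic, and the optional noise term are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, h=None):
    """RBF kernel matrix and the summed kernel gradients (repulsive term).

    X is an (n, d) array of particles: flattened weight vectors for a
    weight-space kernel, or stacked predictions for a functional kernel.
    """
    diffs = X[:, None, :] - X[None, :, :]        # diffs[i, j] = X[i] - X[j]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    if h is None:                                # median heuristic (common choice)
        h = np.sqrt(0.5 * np.median(sq_dists) / np.log(X.shape[0] + 1))
    K = np.exp(-sq_dists / (2 * h ** 2))         # K[i, j] = k(X[i], X[j])
    # sum_j grad_{X[j]} k(X[j], X[i]) = sum_j K[j, i] * (X[i] - X[j]) / h^2
    repulsion = np.einsum('ji,ijd->id', K, diffs) / h ** 2
    return K, repulsion

def svgd_step(W, grad_log_p, step_size=1e-2, noise_scale=0.0, rng=None):
    """One SVGD update on an (n, d) ensemble of particles.

    grad_log_p maps the (n, d) particles to the (n, d) gradients of the log
    posterior. noise_scale > 0 adds Gaussian noise as a rough stand-in for
    the stochastic updates mentioned in the abstract (illustrative
    assumption); noise_scale = 0 is the standard deterministic SVGD step.
    """
    n = W.shape[0]
    K, repulsion = rbf_kernel(W)
    # Driving term (kernel-weighted scores) plus repulsive term (diversity).
    phi = (K @ grad_log_p(W) + repulsion) / n
    W = W + step_size * phi
    if noise_scale > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        W = W + noise_scale * np.sqrt(step_size) * rng.standard_normal(W.shape)
    return W

# Toy usage: 30 particles approximating a standard 2-D Gaussian posterior.
rng = np.random.default_rng(0)
W = 3.0 * rng.standard_normal((30, 2))
for _ in range(500):
    W = svgd_step(W, grad_log_p=lambda w: -w, step_size=0.1)
```

For a functional kernel, one would pass the ensemble's predictions on a batch of inputs to the kernel while still applying the update to the weights; this is roughly the weight-space versus function-space distinction the abstract describes, though the paper's exact formulation may differ.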
Related papers
- Accelerating Convergence of Stein Variational Gradient Descent via Deep
Unfolding [5.584060970507506]
Stein variational gradient descent (SVGD) is a prominent particle-based variational inference method used for sampling a target distribution.
In this paper, we propose novel trainable algorithms that incorporate a deep-learning technique called deep unfolding into SVGD.
arXiv Detail & Related papers (2024-02-23T06:24:57Z)
- Deep Diversity-Enhanced Feature Representation of Hyperspectral Images [87.47202258194719]
We rectify 3D convolution by modifying its topology to enhance the rank upper-bound.
We also propose a novel diversity-aware regularization (DA-Reg) term that acts on the feature maps to maximize independence among elements.
To demonstrate the superiority of the proposed Re$3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks.
arXiv Detail & Related papers (2023-01-15T16:19:18Z)
- Message Passing Neural PDE Solvers [60.77761603258397]
We build a neural message passing solver, replacing all heuristically designed components in the computation graph with backprop-optimized neural function approximators.
We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes.
We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
arXiv Detail & Related papers (2022-02-07T17:47:46Z)
- Grassmann Stein Variational Gradient Descent [3.644031721554146]
Stein variational gradient descent (SVGD) is a deterministic particle inference algorithm that provides an efficient alternative to Markov chain Monte Carlo.
Since SVGD has been found to underestimate variance when the target distribution is high-dimensional, recent developments have advocated projecting both the score function and the data onto real lines to sidestep this issue.
We propose Grassmann Stein variational gradient descent (GSVGD) as an alternative approach, which permits projections onto arbitrary dimensional subspaces.
arXiv Detail & Related papers (2022-02-07T15:36:03Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Neural Variational Gradient Descent [6.414093278187509]
Particle-based approximate Bayesian inference approaches such as Stein Variational Gradient Descent (SVGD) combine the flexibility and convergence guarantees of sampling methods with the computational benefits of variational inference.
We propose Neural Variational Gradient Descent (NVGD), which is based on parameterizing the witness function of the Stein discrepancy by a deep neural network whose parameters are learned in parallel to the inference, eliminating the need to make any kernel choices whatsoever.
arXiv Detail & Related papers (2021-07-22T15:10:50Z)
- Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution.
We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids.
The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z)
- Kernel Stein Generative Modeling [68.03537693810972]
Stochastic Gradient Langevin Dynamics (SGLD) demonstrates impressive results with energy-based models on high-dimensional and complex data distributions.
Stein Variational Gradient Descent (SVGD) is a deterministic sampling algorithm that iteratively transports a set of particles to approximate a given distribution.
We propose noise conditional kernel SVGD (NCK-SVGD), that works in tandem with the recently introduced Noise Conditional Score Network estimator.
arXiv Detail & Related papers (2020-07-06T21:26:04Z)
- Sliced Kernelized Stein Discrepancy [17.159499204595527]
Kernelized Stein discrepancy (KSD) is extensively used in goodness-of-fit tests and model learning.
We propose the sliced Stein discrepancy and its scalable and kernelized variants, which employ kernel-based test functions defined on the optimal one-dimensional projections.
For model learning, we show its advantages over existing Stein discrepancy baselines by training independent component analysis models with different discrepancies.
arXiv Detail & Related papers (2020-06-30T04:58:55Z)
- Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such a transform can be straightforwardly employed in gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) tests on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.