Related papers: Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization

URL: http://arxiv.org/abs/2205.15784v1
Date: Tue, 31 May 2022 13:32:55 GMT
Title: Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization
Authors: Lorenzo Pacchiardi and Ritabrata Dutta
Abstract summary: Inference methods yield posterior approximations for simulator models with intractable likelihood. Many works trained neural networks to approximate either the intractable likelihood or the posterior directly. Here, we propose to approximate the posterior with generative networks trained by Scoring Rule minimization.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Bayesian Likelihood-Free Inference methods yield posterior approximations for simulator models with intractable likelihood. Recently, many works trained neural networks to approximate either the intractable likelihood or the posterior directly. Most proposals use normalizing flows, namely neural networks parametrizing invertible maps used to transform samples from an underlying base measure; the probability density of the transformed samples is then accessible and the normalizing flow can be trained via maximum likelihood on simulated parameter-observation pairs. A recent work [Ramesh et al., 2022] approximated instead the posterior with generative networks, which drop the invertibility requirement and are thus a more flexible class of distributions scaling to high-dimensional and structured data. However, generative networks only allow sampling from the parametrized distribution; for this reason, Ramesh et al. [2022] follows the common solution of adversarial training, where the generative network plays a min-max game against a "critic" network. This procedure is unstable and can lead to a learned distribution underestimating the uncertainty - in extreme cases collapsing to a single point. Here, we propose to approximate the posterior with generative networks trained by Scoring Rule minimization, an overlooked adversarial-free method enabling smooth training and better uncertainty quantification. In simulation studies, the Scoring Rule approach yields better performances with shorter training time with respect to the adversarial framework.
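As a rough illustration of what "Scoring Rule minimization" involves, the sketch below estimates the energy score, a standard proper scoring rule, using only samples from the approximating distribution. The abstract does not say which scoring rule is used, so the choice of the energy score, the function name `energy_score`, and the use of NumPy are assumptions made for illustration. In actual training the samples would come from a generative network, and the averaged score would be minimized by gradient descent in an automatic-differentiation framework.

```python
import numpy as np


def energy_score(samples, y, beta=1.0):
    """Sample-based estimate of the energy score; lower is better.

    samples: array of shape (m, d), draws from the approximate posterior
             (hypothetically, outputs of a generative network).
    y:       array of shape (d,), the value being scored, e.g. the
             parameter that generated a simulated observation.
    beta:    exponent in (0, 2); 1.0 is a common default.
    """
    samples = np.asarray(samples, dtype=float)
    y = np.asarray(y, dtype=float)
    m = samples.shape[0]
    if m < 2:
        raise ValueError("at least two samples are needed")

    # First term: average distance between the samples and the target.
    to_target = np.linalg.norm(samples - y, axis=1) ** beta

    # Second term: average distance between distinct pairs of samples.
    # The diagonal of the pairwise matrix is zero, so dividing the full
    # sum by m * (m - 1) averages over the off-diagonal entries only.
    diffs = samples[:, None, :] - samples[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=2) ** beta
    spread = pairwise.sum() / (m * (m - 1))

    return 2.0 * to_target.mean() - spread
```

The second term rewards spread among the samples, so a distribution that collapses to a single point loses that reward and tends to score worse on average than a well-calibrated one. No critic network is involved in computing this objective.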

Related papers

Walking on the Fiber: A Simple Geometric Approximation for Bayesian Neural Networks [14.632351275859696]
In this work, we revisit sampling techniques for posterior exploration. We introduce a model that learns a deformation of the parameter space, enabling rapid posterior sampling without requiring iterative methods. Empirical results demonstrate that our approach achieves competitive posterior approximations.
arXiv Detail & Related papers (2025-12-01T10:24:10Z)
Training of Spiking Neural Networks with Expectation-Propagation [9.24888258922809]
We propose a unifying message-passing framework for training spiking neural networks (SNNs). Our gradient-free method is capable of learning the marginal distributions of network parameters and simultaneously marginalizes parameters, such as the outputs of hidden layers.
arXiv Detail & Related papers (2025-06-30T11:59:56Z)
A Principled Bayesian Framework for Training Binary and Spiking Neural Networks [1.6658912537684454]
Spiking Bayesian Neural Networks (SBNNs) is a variational inference framework that uses posterior noise to train Binary and Spiking Neural Networks with IW-ST. By linking low-bias conditions, vanishing gradients, and the KL term, we enable training of deep residual networks without normalisation.
arXiv Detail & Related papers (2025-05-23T14:33:20Z)
Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning. This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z)
Sampling weights of deep neural networks [1.2370077627846041]
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks. In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed. We prove that sampled networks are universal approximators.
arXiv Detail & Related papers (2023-06-29T10:13:36Z)
Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning. We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
An unfolding method based on conditional Invertible Neural Networks (cINN) using iterative training [0.0]
Generative networks like invertible neural networks (INN) enable a probabilistic unfolding. We introduce the iterative conditional INN (IcINN) for unfolding that adjusts for deviations between simulated training samples and data.
arXiv Detail & Related papers (2022-12-16T19:00:05Z)
GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference. Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference. GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
arXiv Detail & Related papers (2022-10-24T03:00:01Z)
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations. For gradient flow, we leverage recent work on the implicit bias for homogeneous neural networks to show that asymptotically, gradient flow produces a neural network with rank at most two. For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z)
Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network. We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
arXiv Detail & Related papers (2022-10-10T17:52:47Z)
Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression [40.35734017517066]
Nested networks or slimmable networks are neural networks whose architectures can be adjusted instantly during testing time. Recent studies have focused on a "nested dropout" layer, which is able to order the nodes of a layer by importance during training.
arXiv Detail & Related papers (2021-01-27T12:34:58Z)
A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood. We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks. Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
Compressive sensing with un-trained neural networks: Gradient descent finds the smoothest approximation [60.80172153614544]
Un-trained convolutional neural networks have emerged as highly successful tools for image recovery and restoration. We show that an un-trained convolutional neural network can approximately reconstruct signals and images that are sufficiently structured, from a near minimal number of random measurements.
arXiv Detail & Related papers (2020-05-07T15:57:25Z)
MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. The use of gradient descent combined with nonconvexity renders learning susceptible to novel problems. We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.