A Scalable Gradient-Free Method for Bayesian Experimental Design with
Implicit Models
- URL: http://arxiv.org/abs/2103.08026v1
- Date: Sun, 14 Mar 2021 20:28:51 GMT
- Title: A Scalable Gradient-Free Method for Bayesian Experimental Design with
Implicit Models
- Authors: Jiaxin Zhang, Sirui Bi, Guannan Zhang
- Abstract summary: For implicit models, where the likelihood is intractable but sampling is possible, conventional BED methods have difficulties in efficiently estimating the posterior distribution.
Recent work proposed the use of gradient ascent to maximize a lower bound on MI to deal with these issues.
We propose a novel approach that leverages recent advances in approximate gradient ascent incorporated with a smoothed variational MI for efficient and robust BED.
- Score: 3.437223569602425
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian experimental design (BED) addresses the question of how to
choose designs that maximize the information gathered from an experiment. For implicit models,
where the likelihood is intractable but sampling is possible, conventional BED
methods have difficulties in efficiently estimating the posterior distribution
and maximizing the mutual information (MI) between data and parameters. Recent
work proposed the use of gradient ascent to maximize a lower bound on MI to
deal with these issues. However, the approach requires a sampling path to
compute the pathwise gradient of the MI lower bound with respect to the design
variables, and such a pathwise gradient is usually inaccessible for implicit
models. In this paper, we propose a novel approach that leverages recent
advances in stochastic approximate gradient ascent incorporated with a smoothed
variational MI estimator for efficient and robust BED. Without the need for
pathwise gradients, our approach allows the design process to be carried out
through a unified procedure driven by an approximate gradient for implicit models.
Several experiments show that our approach outperforms baseline methods, and
significantly improves the scalability of BED in high-dimensional problems.
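To make the idea concrete, here is a minimal, illustrative sketch (not the authors' released code) of gradient-free BED for an implicit model: a variational (InfoNCE-style) lower bound on MI is evaluated from simulator samples only, and the design is updated with a Gaussian-smoothed, evolution-strategies-style gradient estimate, so no pathwise gradient through the simulator is needed. The toy simulator, the fixed quadratic critic, and all names (simulate, mi_lower_bound, smoothed_gradient) are assumptions made for this example.

```python
# Illustrative sketch only: gradient-free maximization of a variational MI
# lower bound over design variables for an implicit (simulator-only) model.
# All model choices and names here are assumptions for the example.
import numpy as np

rng = np.random.default_rng(0)


def simulate(theta, d, noise_std=0.5):
    """Implicit forward model: y | theta, d can be sampled, but we pretend the
    likelihood is intractable. Toy choice: y = <d, theta> + Gaussian noise."""
    noise = rng.normal(0.0, noise_std, size=theta.shape[0])
    return theta @ d + noise


def mi_lower_bound(d, n_samples=256, noise_std=0.5):
    """InfoNCE-style lower bound on I(theta; y) at design d with a fixed
    quadratic critic. Any critic gives a valid lower bound; a learned
    (neural) critic, as in the paper, would tighten it."""
    dim = d.shape[0]
    theta = rng.normal(0.0, 1.0, size=(n_samples, dim))   # prior samples
    y = simulate(theta, d, noise_std)                      # paired outcomes
    # critic f(theta_j, y_i) = -(y_i - <d, theta_j>)^2 / (2 noise_std^2)
    mean_pred = theta @ d
    scores = -((y[:, None] - mean_pred[None, :]) ** 2) / (2.0 * noise_std**2)
    log_num = np.diag(scores)                              # positive pairs
    smax = scores.max(axis=1, keepdims=True)               # stabilized log-mean-exp
    log_den = np.log(np.mean(np.exp(scores - smax), axis=1)) + smax[:, 0]
    return float(np.mean(log_num - log_den))


def smoothed_gradient(d, sigma=0.1, n_dirs=16):
    """Gaussian-smoothed (evolution-strategies-style) gradient estimate of the
    MI lower bound w.r.t. the design: uses only bound evaluations, never a
    pathwise gradient through the simulator."""
    grad = np.zeros_like(d)
    for _ in range(n_dirs):
        eps = rng.normal(0.0, 1.0, size=d.shape)
        f_plus = mi_lower_bound(d + sigma * eps)
        f_minus = mi_lower_bound(d - sigma * eps)
        grad += (f_plus - f_minus) / (2.0 * sigma) * eps
    return grad / n_dirs


# Approximate-gradient ascent on the design, projected onto a box constraint.
d = rng.normal(0.0, 0.1, size=4)
for step in range(200):
    d = np.clip(d + 0.05 * smoothed_gradient(d), -1.0, 1.0)
print("optimized design:", d, "MI lower bound:", mi_lower_bound(d))
```

In the paper's setting the critic would be a learned neural network and the bound a smoothed variational MI estimator; the fixed critic here only keeps the sketch short while preserving the key property that the design update needs nothing beyond forward simulations.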
Related papers
- Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models.
We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z) - Diagonalisation SGD: Fast & Convergent SGD for Non-Differentiable Models
via Reparameterisation and Smoothing [1.6114012813668932]
We introduce a simple framework to define non-differentiable functions piecewise and present a systematic approach to obtain smoothings.
Our main contribution is a novel variant of SGD, Diagonalisation Gradient Descent, which progressively enhances the accuracy of the smoothed approximation.
Our approach is simple, fast, and stable, and attains orders-of-magnitude reductions in work-normalised variance.
arXiv Detail & Related papers (2024-02-19T00:43:22Z) - Conflict-Averse Gradient Optimization of Ensembles for Effective Offline
Model-Based Optimization [0.0]
We evaluate two algorithms for combining gradient information: the multiple gradient descent algorithm (MGDA) and conflict-averse gradient descent (CAGrad).
Our results suggest that MGDA and CAGrad strike a desirable balance between conservatism and optimality and can help robustify data-driven offline MBO without compromising optimality of designs.
arXiv Detail & Related papers (2023-03-31T10:00:27Z) - VI-DGP: A variational inference method with deep generative prior for
solving high-dimensional inverse problems [0.7734726150561089]
We propose a novel approximation method for estimating the high-dimensional posterior distribution.
This approach leverages a deep generative model to learn a prior model capable of generating spatially-varying parameters.
The proposed method can be fully implemented using automatic differentiation.
arXiv Detail & Related papers (2023-02-22T06:48:10Z) - Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based
Prior [50.393092185611536]
We consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model.
Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or based on the feedback of model queries.
We propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging.
arXiv Detail & Related papers (2022-03-13T04:06:27Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - A Hybrid Gradient Method to Designing Bayesian Experiments for Implicit
Models [3.437223569602425]
The optimal design is usually achieved by maximizing the mutual information (MI) between the data and the model parameters.
When the analytical expression of the MI is unavailable, e.g., having implicit models with intractable data distributions, a neural network-based lower bound of the MI was recently proposed and a gradient ascent method was used to maximize the lower bound.
We propose a hybrid approach that leverages recent advances in variational MI estimation and evolution strategies (ES), combined with black-box stochastic gradient ascent (SGA), to maximize the MI lower bound.
arXiv Detail & Related papers (2021-03-14T21:10:03Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies the iteration complexity of zeroth-order (ZO) optimization, which does not require first-order information.
We show that with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost; a minimal ZO gradient-estimate sketch appears after this list.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
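The zeroth-order entry above relies on gradient estimates built purely from function queries. As a rough, hedged sketch (not that paper's actual algorithm), the snippet below shows a two-point, coordinate-sampled zeroth-order gradient estimate used for black-box descent; the function names and the uniform coordinate sampling are assumptions for illustration only.

```python
# Hedged sketch (not the paper's algorithm): a two-point zeroth-order gradient
# estimate that uses only function evaluations, sampling coordinates at random.
import numpy as np

rng = np.random.default_rng(0)


def zo_coordinate_gradient(f, x, mu=1e-3, n_queries=8):
    """Finite-difference gradient estimate along randomly sampled coordinates;
    no first-order information about f is used."""
    dim = x.shape[0]
    grad = np.zeros(dim)
    for i in rng.integers(0, dim, size=n_queries):
        e = np.zeros(dim)
        e[i] = 1.0
        grad[i] += (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
    # Rescale so the estimate is unbiased under uniform coordinate sampling.
    return grad * (dim / n_queries)


# Usage: minimize a black-box quadratic with zeroth-order gradient descent.
def loss(x):
    return float(np.sum((x - 1.0) ** 2))


x = np.zeros(10)
for _ in range(500):
    x -= 0.1 * zo_coordinate_gradient(loss, x)
print("final loss:", loss(x))
```

Scaling the accumulated finite differences by dim / n_queries keeps the estimate unbiased under uniform coordinate sampling; importance sampling of coordinates, as in the paper, would instead weight each sampled coordinate by its sampling probability.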