A Hybrid Gradient Method to Designing Bayesian Experiments for Implicit
Models
- URL: http://arxiv.org/abs/2103.08594v1
- Date: Sun, 14 Mar 2021 21:10:03 GMT
- Title: A Hybrid Gradient Method to Designing Bayesian Experiments for Implicit
Models
- Authors: Jiaxin Zhang, Sirui Bi, Guannan Zhang
- Abstract summary: The optimal design is usually achieved by maximizing the mutual information (MI) between the data and the model parameters.
When the analytical expression of the MI is unavailable, e.g., having implicit models with intractable data distributions, a neural network-based lower bound of the MI was recently proposed and a gradient ascent method was used to maximize the lower bound.
We propose a hybrid approach that leverages recent advances in variational MI estimators and evolution strategies (ES) combined with black-box stochastic gradient ascent (SGA) to maximize the MI lower bound.
- Score: 3.437223569602425
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian experimental design (BED) aims at designing an experiment to
maximize the information gathered from the collected data. The optimal design
is usually achieved by maximizing the mutual information (MI) between the data
and the model parameters. When the analytical expression of the MI is
unavailable, e.g., having implicit models with intractable data distributions,
a neural network-based lower bound of the MI was recently proposed and a
gradient ascent method was used to maximize the lower bound. However, the
approach in Kleinegesse et al., 2020 requires a differentiable sampling path to
compute the gradient of the MI lower bound with respect to the design
variables, and such a pathwise gradient is usually inaccessible for
implicit models. In this work, we propose a hybrid gradient approach that
leverages recent advances in variational MI estimators and evolution strategies
(ES) combined with black-box stochastic gradient ascent (SGA) to maximize the
MI lower bound. This allows the design process to be carried out through a
unified, scalable procedure for implicit models without sampling path gradients. Several
experiments demonstrate that our approach significantly improves the
scalability of BED for implicit models in high-dimensional design space.
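
To make the scheme concrete, the sketch below is an illustrative reconstruction, not the authors' code: a neural critic parametrizes an NWJ-style variational MI lower bound, the critic parameters are updated by ordinary stochastic gradient ascent, and the gradient with respect to the design is estimated with an evolution-strategies (ES) perturbation estimator, since no pathwise gradient through the implicit simulator is assumed. The simulator, critic architecture, and all hyperparameters are placeholders.

```python
# Minimal sketch of a hybrid ES / SGA loop for Bayesian experimental design
# with an implicit model. Illustrative only: simulator, critic, and
# hyperparameters are placeholder assumptions, not the paper's settings.
import torch
import torch.nn as nn

def simulate(theta, design):
    """Placeholder implicit simulator: only forward sampling is assumed."""
    return torch.sin(design * theta) + 0.1 * torch.randn_like(theta)

class Critic(nn.Module):
    """Neural critic T(theta, y) used in the variational MI lower bound."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))
    def forward(self, theta, y):
        return self.net(torch.cat([theta, y], dim=-1)).squeeze(-1)

def mi_lower_bound(critic, design, n=256):
    """NWJ-style lower bound on MI(theta; y | design) from simulated samples."""
    theta = torch.randn(n, 1)                 # prior samples
    y = simulate(theta, design)               # joint samples
    y_shuffled = y[torch.randperm(n)]         # product-of-marginals samples
    t_joint = critic(theta, y)
    t_marg = critic(theta, y_shuffled)
    return t_joint.mean() - torch.exp(t_marg - 1.0).mean()

def es_design_gradient(critic, design, sigma=0.1, pop=16):
    """Black-box (ES) estimate of the design gradient of the MI lower bound,
    used because no pathwise gradient through the simulator is available."""
    grad = torch.zeros_like(design)
    with torch.no_grad():
        for _ in range(pop):
            eps = sigma * torch.randn_like(design)
            f_plus = mi_lower_bound(critic, design + eps)
            f_minus = mi_lower_bound(critic, design - eps)
            grad += (f_plus - f_minus) / (2.0 * sigma ** 2) * eps
    return grad / pop

design = torch.zeros(1)                       # 1-D design for illustration
critic = Critic()
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

for step in range(200):
    # (1) SGA step: tighten the MI lower bound w.r.t. critic parameters
    loss = -mi_lower_bound(critic, design)
    opt.zero_grad(); loss.backward(); opt.step()
    # (2) ES step: ascend the estimated design gradient
    design = design + 0.05 * es_design_gradient(critic, design)
```

The point of the ES step is that it needs only forward simulator evaluations at perturbed designs, which is exactly the access an implicit model provides; the two updates can then be interleaved into a single, unified ascent loop.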
Related papers
- Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
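For orientation on the annealing component, here is a minimal, generic AIS sketch on a toy one-dimensional problem; it illustrates the standard AIS recipe only, not the paper's stochastic-gradient variant for Gaussian process latent variable models, and all densities and settings below are placeholder assumptions.

```python
# Minimal annealed importance sampling (AIS) sketch on a toy 1-D target.
import numpy as np

rng = np.random.default_rng(0)

def log_prior(x):            # tractable start distribution: N(0, 1)
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_target(x):           # unnormalized target: N(2, 0.5^2), normalizer unknown
    return -0.5 * ((x - 2.0) / 0.5) ** 2

def ais_log_weight(n_steps=50, n_mcmc=5, step=0.5):
    """One AIS chain annealing log p_beta = (1-beta)*log_prior + beta*log_target."""
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    x = rng.normal()                       # exact sample from the prior
    log_w = 0.0
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # importance-weight increment for moving from beta_prev to beta
        log_w += (b - b_prev) * (log_target(x) - log_prior(x))
        # a few Metropolis-Hastings steps targeting the intermediate density
        for _ in range(n_mcmc):
            prop = x + step * rng.normal()
            log_acc = ((1 - b) * (log_prior(prop) - log_prior(x))
                       + b * (log_target(prop) - log_target(x)))
            if np.log(rng.uniform()) < log_acc:
                x = prop
    return log_w

# Averaging exp(log_w) over chains estimates the target normalizer unbiasedly.
log_weights = np.array([ais_log_weight() for _ in range(200)])
print("estimated log normalizer:", np.log(np.mean(np.exp(log_weights))))
```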
arXiv Detail & Related papers (2024-08-13T08:09:05Z)
- Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space [65.44449711359724]
The high-dimensional and highly multimodal input design space of black-box functions poses inherent challenges for existing methods.
We consider finding a latent space that serves as a compressed yet accurate representation of the design-value joint space.
We propose a Noise-intensified Telescoping density-Ratio Estimation scheme for variational learning of an accurate latent space model.
arXiv Detail & Related papers (2024-05-27T00:11:53Z)
- Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms [88.74308282658133]
Reparameterization (RP) Policy Gradient Methods (PGMs) have been widely adopted for continuous control tasks in robotics and computer graphics.
Recent studies have revealed that, when applied to long-term reinforcement learning problems, model-based RP PGMs may experience chaotic and non-smooth optimization landscapes.
We propose a spectral normalization method to mitigate the exploding variance issue caused by long model unrolls.
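As background for this variance-control step, a bare-bones spectral normalization via power iteration looks roughly as follows; this is a generic sketch of the named technique, and where it is applied inside the unrolled dynamics model is specific to that paper.

```python
# Generic spectral normalization via power iteration: rescale a weight
# matrix so its largest singular value is approximately 1.
import numpy as np

def spectral_normalize(W, n_iters=20):
    """Return W divided by an estimate of its top singular value."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v          # estimated top singular value
    return W / sigma

W = np.random.default_rng(1).normal(size=(64, 64))
W_sn = spectral_normalize(W)
print(np.linalg.svd(W, compute_uv=False)[0],       # before
      np.linalg.svd(W_sn, compute_uv=False)[0])    # after, close to 1.0
```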
arXiv Detail & Related papers (2023-10-30T18:43:21Z)
- Protein Design with Guided Discrete Diffusion [67.06148688398677]
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling.
We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models.
NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
arXiv Detail & Related papers (2023-05-31T16:31:24Z)
- Gradient-based Bayesian Experimental Design for Implicit Models using Mutual Information Lower Bounds [20.393359858407162]
We introduce a framework for Bayesian experimental design (BED) with implicit models, where the data-generating distribution is intractable but sampling from it is still possible.
In order to find optimal experimental designs for such models, our approach maximises mutual information lower bounds that are parametrised by neural networks.
By training a neural network on sampled data, we simultaneously update network parameters and designs using gradient ascent.
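A minimal sketch of that simultaneous update is given below, assuming a toy differentiable simulator so that a pathwise gradient with respect to the design exists (this is the assumption that the hybrid ES/SGA approach of the main paper above avoids); the simulator, critic, and bound are placeholders, not the paper's implementation.

```python
# Simultaneous gradient ascent on critic parameters and designs, assuming
# the simulator is differentiable in the design (pathwise gradient available).
import torch
import torch.nn as nn

def simulate(theta, design):
    # toy differentiable simulator; gradients flow into `design`
    return design * theta + 0.1 * torch.randn_like(theta)

critic = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
design = torch.tensor([0.5], requires_grad=True)
opt = torch.optim.Adam(list(critic.parameters()) + [design], lr=1e-2)

for step in range(500):
    theta = torch.randn(256, 1)
    y = simulate(theta, design)                    # depends on design pathwise
    t_joint = critic(torch.cat([theta, y], dim=-1))
    t_marg = critic(torch.cat([theta, y[torch.randperm(256)]], dim=-1))
    lower_bound = t_joint.mean() - torch.exp(t_marg - 1.0).mean()   # NWJ-style
    opt.zero_grad()
    (-lower_bound).backward()                      # ascend the bound jointly
    opt.step()
```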
arXiv Detail & Related papers (2021-05-10T13:59:25Z)
- A Scalable Gradient-Free Method for Bayesian Experimental Design with Implicit Models [3.437223569602425]
For implicit models, where the likelihood is intractable but sampling is possible, conventional BED methods have difficulties in efficiently estimating the posterior distribution.
Recent work proposed the use of gradient ascent to maximize a lower bound on MI to deal with these issues.
We propose a novel approach that leverages recent advances in approximate gradient ascent combined with a smoothed variational MI estimator for efficient and robust BED.
arXiv Detail & Related papers (2021-03-14T20:28:51Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that, with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost.
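To illustrate the kind of estimator involved, here is a toy zeroth-order coordinate gradient estimator with importance sampling over coordinates; the sampling distribution, reweighting, and step rules in the paper are more elaborate, so this is a hedged sketch of the general idea only.

```python
# Toy zeroth-order (ZO) coordinate gradient estimator with importance
# sampling over coordinates, applied to a quadratic black-box objective.
import numpy as np

rng = np.random.default_rng(0)

def f(x):                                   # black-box objective (placeholder)
    return np.sum((x - 1.0) ** 2)

def zo_coordinate_grad(f, x, probs, n_samples=8, mu=1e-4):
    """Unbiased estimate: sample coordinate i with probability probs[i],
    take a finite difference along e_i, and reweight by 1/probs[i]."""
    grad = np.zeros_like(x)
    idx = rng.choice(len(x), size=n_samples, p=probs)
    for i in idx:
        e = np.zeros_like(x); e[i] = 1.0
        g_i = (f(x + mu * e) - f(x - mu * e)) / (2 * mu)
        grad[i] += g_i / (probs[i] * n_samples)
    return grad

x = np.zeros(10)
probs = np.full(10, 0.1)       # uniform here; importance-weighted in general
for _ in range(300):
    x -= 0.05 * zo_coordinate_grad(f, x, probs)
print(x)                       # approaches the all-ones minimizer
```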
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
- Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation [16.844481439960663]
Implicit models, where the data-generation distribution is intractable but sampling is possible, are ubiquitous in the natural sciences.
A fundamental question is how to design experiments so that the collected data are most useful.
A standard answer is to maximise the mutual information (MI) between the data and the model parameters; for implicit models, however, this approach is severely hampered by the high computational cost of computing posteriors.
We show that training a neural network to maximise a lower bound on MI allows us to jointly determine the optimal design and the posterior.
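The lower bound referred to here can be written, in its Donsker-Varadhan (MINE) form, as follows (other variational bounds such as NWJ are also used in this line of work); here theta denotes the model parameters, y the data collected at design d, and T_psi the neural critic:

```latex
% Donsker-Varadhan (MINE) lower bound on the mutual information between
% parameters \theta and data y at design d, with neural critic T_\psi:
\[
  I(\theta; y \mid d) \;\ge\;
  \mathbb{E}_{p(\theta)\,p(y \mid \theta, d)}\big[ T_\psi(\theta, y) \big]
  \;-\; \log \mathbb{E}_{p(\theta)\,p(y \mid d)}\big[ e^{T_\psi(\theta, y)} \big].
\]
% Maximising over \psi tightens the bound; maximising over d yields the design.
```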
arXiv Detail & Related papers (2020-02-19T12:09:42Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
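For orientation, the classical first-order fact this builds on is that the Fokker-Planck dynamics is the Wasserstein-2 gradient flow of the relative entropy; the second-order scheme itself is specific to that paper. A standard statement, with target density pi proportional to e^{-U}:

```latex
% Fokker-Planck dynamics as the Wasserstein-2 gradient flow of the relative
% entropy KL(rho || pi), with pi proportional to e^{-U}:
\[
  \partial_t \rho_t
  = \nabla \cdot \big( \rho_t \nabla U \big) + \Delta \rho_t
  = \nabla \cdot \Big( \rho_t \, \nabla \tfrac{\delta}{\delta \rho}
      \,\mathrm{KL}\big(\rho_t \,\|\, \pi\big) \Big),
  \qquad \pi \propto e^{-U}.
\]
```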
arXiv Detail & Related papers (2019-10-31T02:26:20Z)