Related papers: Bayesian Sequential Optimal Experimental Design for Nonlinear Models Using Policy Gradient Reinforcement Learning

Bayesian Sequential Optimal Experimental Design for Nonlinear Models Using Policy Gradient Reinforcement Learning

URL: http://arxiv.org/abs/2110.15335v1
Date: Thu, 28 Oct 2021 17:47:31 GMT
Title: Bayesian Sequential Optimal Experimental Design for Nonlinear Models Using Policy Gradient Reinforcement Learning
Authors: Wanggang Shen, Xun Huan
Abstract summary: We formulate this sequential optimal experimental design (sOED) problem as a finite-horizon partially observable Markov decision process (POMDP) It is built to accommodate continuous random variables, general non-Gaussian posteriors, and expensive nonlinear forward models. We solve for the sOED policy numerically via policy gradient (PG) methods from reinforcement learning, and derive and prove the PG expression for sOED. The overall PG-sOED method is validated on a linear-Gaussian benchmark, and its advantages over batch and greedy designs are demonstrated through a contaminant source inversion problem in a
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present a mathematical framework and computational methods to optimally design a finite number of sequential experiments. We formulate this sequential optimal experimental design (sOED) problem as a finite-horizon partially observable Markov decision process (POMDP) in a Bayesian setting and with information-theoretic utilities. It is built to accommodate continuous random variables, general non-Gaussian posteriors, and expensive nonlinear forward models. sOED then seeks an optimal design policy that incorporates elements of both feedback and lookahead, generalizing the suboptimal batch and greedy designs. We solve for the sOED policy numerically via policy gradient (PG) methods from reinforcement learning, and derive and prove the PG expression for sOED. Adopting an actor-critic approach, we parameterize the policy and value functions using deep neural networks and improve them using gradient estimates produced from simulated episodes of designs and observations. The overall PG-sOED method is validated on a linear-Gaussian benchmark, and its advantages over batch and greedy designs are demonstrated through a contaminant source inversion problem in a convection-diffusion field.

Related papers

Gradient-Free Sequential Bayesian Experimental Design via Interacting Particle Systems [2.624902795082451]
We introduce a gradient-free framework for Bayesian Optimal Experimental Design (BOED) in sequential settings. Our method combines Ensemble Kalman Inversion (EKI) for design optimization with the Affine-Invariant Langevin Dynamics (ALDI) sampler for efficient posterior sampling. variational Gaussian and parametrized Laplace approximations provide tractable upper and lower bounds on the Expected Information Gain.
arXiv Detail & Related papers (2025-04-17T20:16:15Z)
A Simulation-Free Deep Learning Approach to Stochastic Optimal Control [12.699529713351287]
We propose a simulation-free algorithm for the solution of generic problems in optimal control (SOC) Unlike existing methods, our approach does not require the solution of an adjoint problem.
arXiv Detail & Related papers (2024-10-07T16:16:53Z)
Efficient Learning of POMDPs with Known Observation Model in Average-Reward Setting [56.92178753201331]
We propose the Observation-Aware Spectral (OAS) estimation technique, which enables the POMDP parameters to be learned from samples collected using a belief-based policy. We show the consistency of the OAS procedure, and we prove a regret guarantee of order $mathcalO(sqrtT log(T)$ for the proposed OAS-UCRL algorithm.
arXiv Detail & Related papers (2024-10-02T08:46:34Z)
Variational Bayesian Optimal Experimental Design with Normalizing Flows [0.837622912636323]
Variational OED estimates a lower bound of the EIG without likelihood evaluations. We introduce the use of normalizing flows for representing variational distributions in vOED. We show that a composition of 4--5 layers is able to achieve lower EIG estimation bias.
arXiv Detail & Related papers (2024-04-08T14:44:21Z)
Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when emphdone right -- by which we mean using specific insights from optimisation and kernel communities -- gradient descent is highly effective. We introduce a emphstochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices. Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z)
Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods. Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem. As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
Variational Sequential Optimal Experimental Design using Reinforcement Learning [0.0]
We introduce variational sequential Optimal Experimental Design (vsOED), a new method for optimally designing a finite sequence of experiments under a Bayesian framework and with information-gain utilities. Our vsOED results indicate substantially improved sample efficiency and reduced number of forward model simulations compared to previous sequential design algorithms.
arXiv Detail & Related papers (2023-06-17T21:47:19Z)
Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP) Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN. It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
Stochastic Learning Approach to Binary Optimization for Optimal Design of Experiments [0.0]
We present a novel approach to binary optimization for optimal experimental design (OED) for Bayesian inverse problems governed by mathematical models such as partial differential equations. The OED utility function, namely, the regularized optimality gradient, is cast into an objective function in the form of an expectation over a Bernoulli distribution. The objective is then solved by using a probabilistic optimization routine to find an optimal observational policy.
arXiv Detail & Related papers (2021-01-15T03:54:12Z)
Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work is on the iteration of zero-th-order (ZO) optimization which does not require first-order information. We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity as well as as function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way. Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
Optimal Bayesian experimental design for subsurface flow problems [77.34726150561087]
We propose a novel approach for development of chaos expansion (PCE) surrogate model for the design utility function. This novel technique enables the derivation of a reasonable quality response surface for the targeted objective function with a computational budget comparable to several single-point evaluations.
arXiv Detail & Related papers (2020-08-10T09:42:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.