Reinforcement Learning based Sequential Batch-sampling for Bayesian
Optimal Experimental Design
- URL: http://arxiv.org/abs/2112.10944v2
- Date: Thu, 23 Dec 2021 07:15:09 GMT
- Title: Reinforcement Learning based Sequential Batch-sampling for Bayesian
Optimal Experimental Design
- Authors: Yonatan Ashenafi, Piyush Pandita, Sayan Ghosh
- Abstract summary: Sequential design of experiments (SDOE) is a popular suite of methods that has yielded promising results in recent years.
In this work, we aim to extend the SDOE strategy, to query the experiment or computer code at a batch of inputs.
A unique capability of the proposed methodology is its ability to be applied to multiple tasks, for example the optimization of a function, once it is trained.
- Score: 1.6249267147413522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Engineering problems that are modeled using sophisticated mathematical
methods, or that are characterized by expensive-to-conduct tests or experiments, are
encumbered by a limited budget or finite computational resources. Moreover,
practical scenarios in industry impose restrictions, based on logistics and
preference, on the manner in which the experiments can be conducted. For
example, material supply may enable only a handful of experiments in a single
shot, or, in the case of computational models, one may face significant wait
times due to shared computational resources. In such scenarios, one usually
resorts to performing experiments in a manner that maximizes one's state of
knowledge while satisfying the above-mentioned practical constraints.
Sequential design of experiments (SDOE) is a popular suite of methods that has
yielded promising results in recent years across different engineering and
practical problems. A common strategy that leverages the Bayesian formalism is
Bayesian SDOE, which usually works best in the one-step-ahead or myopic
scenario of selecting a single experiment at each step of a sequence of
experiments. In this work, we aim to extend the SDOE strategy to query the
experiment or computer code at a batch of inputs. To this end, we leverage
deep reinforcement learning (RL) based policy gradient methods to propose
batches of queries that are selected taking into account the entire budget in
hand. The algorithm retains the sequential nature inherent in SDOE while
incorporating elements of task-based reward from the domain of deep RL. A
unique capability of the proposed methodology is its ability to be applied to
multiple tasks, for example the optimization of a function, once it is
trained. We demonstrate the performance of the proposed algorithm on a
synthetic problem and on a challenging high-dimensional engineering problem.
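To make the batch-selection idea more concrete, the sketch below is a minimal, hypothetical Python illustration of the general recipe described in the abstract: a stochastic policy proposes a batch of design points at each step of a fixed budget and is updated with a plain REINFORCE policy gradient, using the reduction in a Gaussian-process surrogate's predictive variance as a stand-in reward. This is not the authors' implementation; the 1-D toy design space, the variance-reduction reward, the open-loop Gaussian policy (one mean vector per budget step rather than a history-conditioned network), and all hyperparameters are illustrative assumptions.

```python
# Minimal, hypothetical sketch of RL-based sequential batch sampling for
# experimental design (not the authors' code). A Gaussian policy proposes a
# batch of design points at each step of a fixed budget; the reward is the
# reduction in the predictive variance of a simple GP surrogate, and the
# policy means are updated with a plain REINFORCE gradient.
import numpy as np

rng = np.random.default_rng(0)

def posterior_variance(train_x, query_x, length_scale=0.3, noise=1e-3):
    """Predictive variance of a unit-variance RBF Gaussian process at query_x."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale ** 2)
    K = k(train_x, train_x) + noise * np.eye(len(train_x))
    Ks = k(query_x, train_x)
    return 1.0 - np.sum((Ks @ np.linalg.inv(K)) * Ks, axis=1)

batch_size, budget, n_episodes, lr, sigma = 3, 4, 200, 0.05, 0.2
grid = np.linspace(0.0, 1.0, 50)                       # where uncertainty is monitored
mu = rng.uniform(0.0, 1.0, size=(budget, batch_size))  # policy mean: one batch per step

for episode in range(n_episodes):
    train_x = np.array([0.5])                          # initial design
    log_prob_grads, rewards = [], []
    for step in range(budget):
        # Sample a batch of designs from the Gaussian policy and keep it in [0, 1].
        batch = np.clip(mu[step] + sigma * rng.normal(size=batch_size), 0.0, 1.0)
        before = posterior_variance(train_x, grid).mean()
        train_x = np.concatenate([train_x, batch])     # "run" the batch of experiments
        after = posterior_variance(train_x, grid).mean()
        rewards.append(before - after)                 # reward: drop in average variance
        log_prob_grads.append((batch - mu[step]) / sigma ** 2)  # grad of Gaussian log-prob wrt mu
    # REINFORCE with reward-to-go, so early batches are credited for the whole remaining budget.
    for step in range(budget):
        reward_to_go = sum(rewards[step:])
        mu[step] += lr * reward_to_go * log_prob_grads[step]

print("Learned batch means per budget step:\n", np.round(mu, 3))
```

Because each batch is rewarded with its reward-to-go over the remaining budget, early batches are chosen with the full budget in mind rather than myopically, which is the key departure from one-step-ahead Bayesian SDOE that the paper emphasizes.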
Related papers
- CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection.
Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z)
- Globally-Optimal Greedy Experiment Selection for Active Sequential Estimation [1.1530723302736279]
We study the problem of active sequential estimation, which involves adaptively selecting experiments for sequentially collected data.
The goal is to design experiment selection rules for more accurate model estimation.
We propose a class of greedy experiment selection methods and provide a statistical analysis of the resulting maximum likelihood estimator.
arXiv Detail & Related papers (2024-02-13T17:09:29Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities of large language models.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- Task-specific experimental design for treatment effect estimation [59.879567967089145]
Large randomised controlled trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
arXiv Detail & Related papers (2023-06-08T18:10:37Z)
- Active Exploration via Experiment Design in Markov Chains [86.41407938210193]
A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest.
We propose an algorithm that efficiently selects policies whose measurement allocation converges to the optimal one.
In addition to our theoretical analysis, we showcase our framework on applications in ecological surveillance and pharmacology.
arXiv Detail & Related papers (2022-06-29T00:04:40Z)
- Exploring Viable Algorithmic Options for Learning from Demonstration (LfD): A Parameterized Complexity Approach [0.0]
In this paper, we show how such a systematic exploration of algorithmic options can be done using parameterized complexity analysis.
We show that none of our problems can be solved efficiently either in general or relative to a number of (often simultaneous) restrictions on environments, demonstrations, and policies.
arXiv Detail & Related papers (2022-05-10T15:54:06Z)
- Policy-Based Bayesian Experimental Design for Non-Differentiable Implicit Models [25.00242490764664]
Reinforcement Learning for Deep Adaptive Design (RL-DAD) is a method for simulation-based optimal experimental design for non-differentiable implicit models.
RL-DAD maps prior histories to experiment designs offline and can be quickly deployed during online execution.
arXiv Detail & Related papers (2022-03-08T18:47:01Z)
- An Actor-Critic Method for Simulation-Based Optimization [6.261751912603047]
We focus on a simulation-based optimization problem of choosing the best design from the feasible space.
We formulate the sampling process as a policy search problem and give a solution from the perspective of Reinforcement Learning (RL).
Experiments are designed to validate the effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2021-10-31T09:04:23Z)
- Output Space Entropy Search Framework for Multi-Objective Bayesian Optimization [32.856318660282255]
We consider black-box multi-objective optimization (MOO) using expensive function evaluations (also referred to as experiments).
We propose a general framework for solving MOO problems based on the principle of output space entropy (OSE) search.
Our OSE-search-based algorithms improve over state-of-the-art methods in terms of both computational efficiency and accuracy of MOO solutions.
arXiv Detail & Related papers (2021-10-13T18:43:39Z)
- Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration [143.43658264904863]
We show how exploration under a more standard notion of low inherent Bellman error, typically employed in least-squares value-iteration-style algorithms, can provide strong PAC guarantees on learning a near-optimal value function.
We present a computationally tractable algorithm for the reward-free setting and show how it can be used to learn a near-optimal policy for any (linear) reward function.
arXiv Detail & Related papers (2020-08-18T04:34:21Z)
- Optimal Bayesian experimental design for subsurface flow problems [77.34726150561087]
We propose a novel approach for the development of a polynomial chaos expansion (PCE) surrogate model for the design utility function.
This novel technique enables the derivation of a reasonable quality response surface for the targeted objective function with a computational budget comparable to several single-point evaluations.
arXiv Detail & Related papers (2020-08-10T09:42:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.