Related papers: Optimizing Sequential Experimental Design with Deep Reinforcement Learning

URL: http://arxiv.org/abs/2202.00821v1
Date: Wed, 2 Feb 2022 00:23:05 GMT
Title: Optimizing Sequential Experimental Design with Deep Reinforcement Learning
Authors: Tom Blau, Edwin Bonilla, Amir Dezfouli, Iadine Chades
Abstract summary: We show that the problem of optimizing policies can be reduced to solving a Markov decision process (MDP). Our approach is also computationally efficient at deployment time and exhibits state-of-the-art performance on both continuous and discrete design spaces.
Score: 7.589363597086081
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Bayesian approaches developed to solve the optimal design of sequential experiments are mathematically elegant but computationally challenging. Recently, techniques using amortization have been proposed to make these Bayesian approaches practical, by training a parameterized policy that proposes designs efficiently at deployment time. However, these methods may not sufficiently explore the design space, require access to a differentiable probabilistic model and can only optimize over continuous design spaces. Here, we address these limitations by showing that the problem of optimizing policies can be reduced to solving a Markov decision process (MDP). We solve the equivalent MDP with modern deep reinforcement learning techniques. Our experiments show that our approach is also computationally efficient at deployment time and exhibits state-of-the-art performance on both continuous and discrete design spaces, even when the probabilistic model is a black box.

Related papers

Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models [56.92178753201331]
We tackle average-reward infinite-horizon POMDPs with an unknown transition model. We present a novel and simple estimator that overcomes this barrier.
arXiv Detail & Related papers (2025-01-30T22:29:41Z)
Learning Joint Models of Prediction and Optimization [56.04498536842065]
The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving. This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by joint predictive models.
arXiv Detail & Related papers (2024-09-07T19:52:14Z)
Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization. We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons. Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z)
An Adaptive Dimension Reduction Estimation Method for High-dimensional Bayesian Optimization [6.79843988450982]
We propose a two-step optimization framework to extend BO to high-dimensional settings. Our algorithm offers the flexibility to operate these steps either concurrently or in sequence. Numerical experiments validate the efficacy of our method in challenging scenarios.
arXiv Detail & Related papers (2024-03-08T16:21:08Z)
End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality. This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives. It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
Differentiable Multi-Target Causal Bayesian Experimental Design [43.76697029708785]
We introduce a gradient-based approach for the problem of Bayesian optimal experimental design to learn causal models in a batch setting. Existing methods rely on greedy approximations to construct a batch of experiments. We propose a conceptually simple end-to-end gradient-based optimization procedure to acquire a set of optimal intervention target-state pairs.
arXiv Detail & Related papers (2023-02-21T11:32:59Z)
New Paradigms for Exploiting Parallel Experiments in Bayesian Optimization [0.0]
We present new parallel BO paradigms that exploit the structure of the system to partition the design space. Specifically, we propose an approach that partitions the design space by following the level sets of the performance function. Our results show that our approaches significantly reduce the required search time and increase the probability of finding a global (rather than local) solution.
arXiv Detail & Related papers (2022-10-03T16:45:23Z)
Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new algorithm for a policy gradient in TMDPs by a simple extension of the proximal policy optimization (PPO) algorithm. We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
An Actor-Critic Method for Simulation-Based Optimization [6.261751912603047]
We focus on a simulation-based optimization problem of choosing the best design from the feasible space. We formulate the sampling process as a policy search problem and give a solution from the perspective of Reinforcement Learning (RL). Some experiments are designed to validate the effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2021-10-31T09:04:23Z)
An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way. Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
Optimal Bayesian experimental design for subsurface flow problems [77.34726150561087]
We propose a novel approach for developing a polynomial chaos expansion (PCE) surrogate model for the design utility function. This novel technique enables the derivation of a reasonable quality response surface for the targeted objective function with a computational budget comparable to several single-point evaluations.
arXiv Detail & Related papers (2020-08-10T09:42:59Z)
Adaptive Discretization for Model-Based Reinforcement Learning [10.21634042036049]
We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm. Our algorithm is based on optimistic one-step value iteration extended to maintain an adaptive discretization of the space.
arXiv Detail & Related papers (2020-07-01T19:36:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.