Sequential Bayesian experimental designs via reinforcement learning
- URL: http://arxiv.org/abs/2202.07472v1
- Date: Mon, 14 Feb 2022 04:29:04 GMT
- Title: Sequential Bayesian experimental designs via reinforcement learning
- Authors: Hikaru Asano
- Abstract summary: We propose a new approach, Sequential Experimental Design via Reinforcement Learning, which constructs Bayesian experimental designs (BED) sequentially.
By proposing a new real-world-oriented experimental environment, our approach aims to maximize the expected information gain.
Numerical experiments confirm that our method outperforms existing methods on several metrics, including EIG and sampling efficiency.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian experimental design (BED) has been used as a method for conducting
efficient experiments based on Bayesian inference. The existing methods,
however, mostly focus on maximizing the expected information gain (EIG); the
cost of experiments and sample efficiency are often not taken into account. To
address this issue and enhance the practical applicability of BED, we propose a
new approach, Sequential Experimental Design via Reinforcement Learning, which
constructs BED sequentially by applying reinforcement learning. Reinforcement
learning is a branch of machine learning in which an agent learns a policy that
maximizes its reward by interacting with the environment. This interaction
resembles running a sequential experiment, and reinforcement learning excels at
sequential decision making.
By proposing a new real-world-oriented experimental environment, our approach
aims to maximize the EIG while keeping the cost of experiments and sample
efficiency in mind simultaneously. We conduct numerical experiments for three
different examples. The results confirm that our method outperforms existing
methods on several metrics, including EIG and sampling efficiency, indicating
that the proposed method and experimental environment can contribute
significantly to applying BED in the real world.
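The expected information gain (EIG) named in the abstract is commonly defined as the expected log ratio between the likelihood p(y | theta, d) and the marginal p(y | d) for a design d. The following is a minimal sketch of a nested Monte Carlo estimate of that quantity, not the paper's method. The toy model (theta ~ N(0, 1), y = theta * d + Gaussian noise), the function names `nmc_eig` and `analytic_eig`, and all parameter values are assumptions chosen for illustration. For this toy model the EIG has the closed form 0.5 * log(1 + d^2 / sigma^2), which gives a check on the estimate.

```python
import math
import random


def nmc_eig(design, sigma=1.0, n_outer=1000, n_inner=500, seed=0):
    """Nested Monte Carlo EIG estimate for the assumed toy model.

    Assumed model (illustrative only): theta ~ N(0, 1),
    y = theta * design + N(0, sigma^2).
    """
    rng = random.Random(seed)
    # Inner prior samples, reused for every outer draw (a common simplification).
    inner_thetas = [rng.gauss(0.0, 1.0) for _ in range(n_inner)]
    total = 0.0
    for _ in range(n_outer):
        # Outer draw: sample a parameter, then simulate an outcome.
        theta = rng.gauss(0.0, 1.0)
        y = theta * design + rng.gauss(0.0, sigma)
        # Log-likelihood up to a constant; the Gaussian normalizer
        # cancels against the same constant in the marginal below.
        log_lik = -0.5 * ((y - theta * design) / sigma) ** 2
        # Approximate log p(y | design) with a stable log-mean-exp.
        log_terms = [-0.5 * ((y - t * design) / sigma) ** 2 for t in inner_thetas]
        m = max(log_terms)
        log_marg = m + math.log(sum(math.exp(v - m) for v in log_terms) / n_inner)
        total += log_lik - log_marg
    return total / n_outer


def analytic_eig(design, sigma=1.0):
    """Closed-form EIG for the assumed linear-Gaussian toy model."""
    return 0.5 * math.log1p((design / sigma) ** 2)
```

In this toy setting a larger |d| yields a larger EIG, so a design procedure that only maximizes EIG would always pick the most extreme design. That is the kind of situation in which a cost term, as the abstract motivates, becomes relevant.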
Related papers
- Efficient Diversity-based Experience Replay for Deep Reinforcement Learning [14.96744975805832]
This paper proposes a novel approach, diversity-based experience replay (DBER), which leverages the determinantal point process to prioritize diverse samples in state realizations.
We conducted extensive experiments on Robotic Manipulation tasks in MuJoCo, Atari games, and realistic in-door environments in Habitat.
arXiv Detail & Related papers (2024-10-27T15:51:27Z)
- Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
- Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches [13.504353263032359]
The selection of the assumed effect size (AES) critically determines the duration of an experiment, and hence its accuracy and efficiency.
Traditionally, experimenters determine AES based on domain knowledge, but this method becomes impractical for online experimentation services managing numerous experiments.
We propose two solutions for data-driven AES selection for online experimentation services.
arXiv Detail & Related papers (2023-12-20T09:34:28Z)
- Adaptive Instrument Design for Indirect Experiments [48.815194906471405]
Unlike RCTs, indirect experiments estimate treatment effects by leveraging conditional instrumental variables.
In this paper we take the initial steps towards enhancing sample efficiency for indirect experiments by adaptively designing a data collection policy.
Our main contribution is a practical computational procedure that utilizes influence functions to search for an optimal data collection policy.
arXiv Detail & Related papers (2023-12-05T02:38:04Z)
- Opportunities for Adaptive Experiments to Enable Continuous Improvement in Computer Science Education [7.50867730317249]
In adaptive experiments, data is analyzed and utilized as different conditions are deployed to students.
These algorithms can then dynamically deploy the most effective conditions in subsequent interactions with students.
This work paves the way for exploring the importance of adaptive experiments in bridging research and practice to achieve continuous improvement.
arXiv Detail & Related papers (2023-10-18T20:54:59Z)
- Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z)
- Environment Design for Inverse Reinforcement Learning [3.085995273374333]
Current inverse reinforcement learning methods that focus on learning from a single environment can fail to handle slight changes in the environment dynamics.
In our framework, the learner repeatedly interacts with the expert, with the former selecting environments to identify the reward function.
This results in improvements in both sample-efficiency and robustness, as we show experimentally, for both exact and approximate inference.
arXiv Detail & Related papers (2022-10-26T18:31:17Z)
- Design Amortization for Bayesian Optimal Experimental Design [70.13948372218849]
We build on successful variational approaches, which optimize a parameterized variational model with respect to bounds on the expected information gain (EIG).
We present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs.
arXiv Detail & Related papers (2022-10-07T02:12:34Z)
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z)
- Incorporating Expert Prior Knowledge into Experimental Design via Posterior Sampling [58.56638141701966]
Experimenters can often acquire knowledge about the location of the global optimum.
It is unknown how to incorporate the expert prior knowledge about the global optimum into Bayesian optimization.
An efficient Bayesian optimization approach has been proposed via posterior sampling on the posterior distribution of the global optimum.
arXiv Detail & Related papers (2020-02-26T01:57:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.