Towards Futuristic Autonomous Experimentation--A Surprise-Reacting
Sequential Experiment Policy
- URL: http://arxiv.org/abs/2112.00600v1
- Date: Wed, 1 Dec 2021 16:14:49 GMT
- Title: Towards Futuristic Autonomous Experimentation--A Surprise-Reacting
Sequential Experiment Policy
- Authors: Imtiaz Ahmed and Satish Bukkapatnam and Bhaskar Botcha and Yu Ding
- Abstract summary: An autonomous experimentation platform in manufacturing is supposedly capable of conducting a sequential search for suitable manufacturing conditions for advanced materials.
We argue that such capability is much needed for futuristic autonomous experimentation platforms.
- Score: 3.326548149772318
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: An autonomous experimentation platform in manufacturing is supposedly capable
of conducting a sequential search for suitable manufacturing conditions
for advanced materials by itself or even for discovering new materials with
minimal human intervention. The core of the intelligent control of such
platforms is the policy directing sequential experiments, namely, to decide
where to conduct the next experiment based on what has been done thus far. Such
a policy inevitably trades off exploitation versus exploration, and current
practice follows the Bayesian optimization framework using the expected
improvement criterion or its variants. We discuss whether it is beneficial to
trade off exploitation versus exploration by measuring the element and degree
of surprise associated with the immediate past observation. We devise a
surprise-reacting policy using two existing surprise metrics, known as the
Shannon surprise and Bayesian surprise. Our analysis shows that the
surprise-reacting policy appears to be better suited for quickly characterizing
the overall landscape of a response surface or a design space under resource
constraints. We argue that such capability is much needed for futuristic
autonomous experimentation platforms. We do not claim that we have a fully
autonomous experimentation platform, but believe that our current effort sheds
new light or provides a different perspective as researchers race to
elevate the autonomy of various primitive autonomous experimentation systems.
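The two surprise metrics named in the abstract can be illustrated for a simple conjugate-Gaussian belief update: Shannon surprise as the negative log-likelihood of an observation under the current predictive distribution, and Bayesian surprise as the KL divergence from the prior to the posterior. This is a minimal sketch under those common definitions; the function names, the prior and noise values, and the Gaussian model are illustrative assumptions, not the paper's implementation.

```python
import math

def shannon_surprise(x, mu, var):
    # Shannon surprise: negative log-density of x under the predictive N(mu, var)
    return 0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def kl_gaussian(mu_q, var_q, mu_p, var_p):
    # KL( N(mu_q, var_q) || N(mu_p, var_p) ); with q = posterior and p = prior
    # this is one common definition of Bayesian surprise
    return 0.5 * (math.log(var_p / var_q)
                  + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

def posterior_update(mu0, var0, x, noise_var):
    # Conjugate Gaussian update of a belief N(mu0, var0) about the mean,
    # after one observation x with known noise variance
    var1 = 1.0 / (1.0 / var0 + 1.0 / noise_var)
    mu1 = var1 * (mu0 / var0 + x / noise_var)
    return mu1, var1

# Illustrative numbers: prior belief N(0, 1), observation x = 2.5, noise var 0.5
mu0, var0, noise_var = 0.0, 1.0, 0.5
x = 2.5

# Predictive distribution of x before observing it: N(mu0, var0 + noise_var)
s_shannon = shannon_surprise(x, mu0, var0 + noise_var)

mu1, var1 = posterior_update(mu0, var0, x, noise_var)
s_bayes = kl_gaussian(mu1, var1, mu0, var0)  # Bayesian surprise of this update
```

A surprise-reacting policy in the spirit of the abstract would compare such a score against a threshold after each experiment to decide whether to keep exploiting the current region or switch to exploring elsewhere; how the paper maps surprise to that decision is not specified here.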
Related papers
- Learning Optimal Deterministic Policies with Stochastic Policy Gradients [62.81324245896716]
Policy gradient (PG) methods are successful approaches to deal with continuous reinforcement learning (RL) problems.
In common practice, (hyper)policies are learned to convergence only to deploy their deterministic version.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
arXiv Detail & Related papers (2024-05-03T16:45:15Z) - Counterfactual Prediction Under Selective Confounding [3.6860485638625673]
This research addresses the challenge of conducting causal inference between a binary treatment and its resulting outcome when not all confounders are known.
We relax the requirement of knowing all confounders under desired treatment, which we refer to as Selective Confounding.
We provide both theoretical error bounds and empirical evidence of the effectiveness of our proposed scheme using synthetic and real-world child placement data.
arXiv Detail & Related papers (2023-10-21T16:54:59Z) - Maximum State Entropy Exploration using Predecessor and Successor
Representations [17.732962106114478]
Animals have a developed ability to explore that aids them in important tasks such as locating food.
We propose $\eta\psi$-Learning, a method to learn efficient exploratory policies by conditioning on past episodic experience.
arXiv Detail & Related papers (2023-06-26T16:08:26Z) - Experimentation Platforms Meet Reinforcement Learning: Bayesian
Sequential Decision-Making for Continuous Monitoring [13.62951379287041]
In this paper, we introduce a novel framework that we developed in Amazon to maximize customer experience and control opportunity cost.
We formulate the problem as a Bayesian optimal sequential decision making problem that has a unified utility function.
We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis on experiments in Amazon.
arXiv Detail & Related papers (2023-04-02T00:59:10Z) - Active Exploration via Experiment Design in Markov Chains [86.41407938210193]
A key challenge in science and engineering is to design experiments to learn about some unknown quantity of interest.
We propose an algorithm that efficiently selects policies whose measurement allocation converges to the optimal one.
In addition to our theoretical analysis, we showcase our framework on applications in ecological surveillance and pharmacology.
arXiv Detail & Related papers (2022-06-29T00:04:40Z) - What Should I Know? Using Meta-gradient Descent for Predictive Feature
Discovery in a Single Stream of Experience [63.75363908696257]
Computational reinforcement learning seeks to construct an agent's perception of the world through predictions of future sensations.
An open challenge in this line of work is determining from the infinitely many predictions that the agent could possibly make which predictions might best support decision-making.
We introduce a meta-gradient descent process by which an agent learns 1) what predictions to make, 2) the estimates for its chosen predictions, and 3) how to use those estimates to generate policies that maximize future reward.
arXiv Detail & Related papers (2022-06-13T21:31:06Z) - Sayer: Using Implicit Feedback to Optimize System Policies [63.992191765269396]
We develop a methodology that leverages implicit feedback to evaluate and train new system policies.
Sayer builds on two ideas from reinforcement learning to leverage data collected by an existing policy.
We show that Sayer can evaluate arbitrary policies accurately, and train new policies that outperform the production policies.
arXiv Detail & Related papers (2021-10-28T04:16:56Z) - Crowd Sensing and Living Lab Outdoor Experimentation Made Easy [2.5234156040689237]
This article introduces Smart Agora, a novel open-source software platform for rigorous systematic outdoor experimentation.
Without writing a single line of code, highly complex experimental scenarios are visually designed and automatically deployed to smart phones.
arXiv Detail & Related papers (2021-07-08T21:49:32Z) - Rethinking Exploration for Sample-Efficient Policy Learning [20.573107021603356]
We examine why directed exploration methods have not been more influential in the sample-efficient control problem.
Three issues have limited the applicability of BBE: bias with finite samples, slow adaptation to decaying bonuses, and lack of optimism on unseen transitions.
We propose modifications to the bonus-based exploration recipe to address each of these limitations.
The resulting algorithm, which we call UFO, produces policies that are Unbiased with finite samples, Fast-adapting as the exploration bonus changes, and Optimistic with respect to new transitions.
arXiv Detail & Related papers (2021-01-23T08:51:04Z) - Learning "What-if" Explanations for Sequential Decision-Making [92.8311073739295]
Building interpretable parameterizations of real-world decision-making on the basis of demonstrated behavior is essential.
We propose learning explanations of expert decisions by modeling their reward function in terms of preferences with respect to "what if" outcomes.
We highlight the effectiveness of our batch, counterfactual inverse reinforcement learning approach in recovering accurate and interpretable descriptions of behavior.
arXiv Detail & Related papers (2020-07-02T14:24:17Z) - Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement
Learning Framework [68.96770035057716]
A/B testing is a business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries.
This paper introduces a reinforcement learning framework for carrying out A/B testing in online experiments.
arXiv Detail & Related papers (2020-02-05T10:25:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.