Gemini: Dynamic Bias Correction for Autonomous Experimentation and
Molecular Simulation
- URL: http://arxiv.org/abs/2103.03391v1
- Date: Fri, 5 Mar 2021 00:11:56 GMT
- Title: Gemini: Dynamic Bias Correction for Autonomous Experimentation and
Molecular Simulation
- Authors: Riley J. Hickman, Florian Häse, Loïc M. Roch, Alán Aspuru-Guzik
- Abstract summary: We introduce Gemini, a data-driven model capable of using inexpensive measurements as proxies for expensive measurements.
We show that the number of measurements of a composition space comprising expensive and rare metals needed to achieve a target overpotential is significantly reduced when measurements from a proxy composition system with less expensive metals are available.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian optimization has emerged as a powerful strategy to accelerate
scientific discovery by means of autonomous experimentation. However, expensive
measurements are required to accurately estimate materials properties, and can
quickly become a hindrance to exhaustive materials discovery campaigns. Here,
we introduce Gemini: a data-driven model capable of using inexpensive
measurements as proxies for expensive measurements by correcting systematic
biases between property evaluation methods. We recommend using Gemini for
regression tasks with sparse data and in an autonomous workflow setting where
its predictions of expensive-to-evaluate objectives can be used to construct a
more informative acquisition function, thus reducing the number of expensive
evaluations an optimizer needs to achieve desired target values. In a
regression setting, we showcase the ability of our method to make accurate
predictions of DFT calculated bandgaps of hybrid organic-inorganic perovskite
materials. We further demonstrate the benefits that Gemini provides to
autonomous workflows by augmenting the Bayesian optimizer Phoenics to yield a
scalable optimization framework leveraging multiple sources of measurement.
Finally, we simulate an autonomous materials discovery platform for optimizing
the activity of electrocatalysts for the oxygen evolution reaction. Realizing
autonomous workflows with Gemini, we show that the number of measurements of a
composition space comprising expensive and rare metals needed to achieve a
target overpotential is significantly reduced when measurements from a proxy
composition system with less expensive metals are available.
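
To make the abstract's central idea concrete, below is a minimal, illustrative sketch (Python) of how scarce expensive measurements and abundant cheap proxy measurements can be combined through bias correction and then fed into an acquisition function. This is not the authors' implementation: Gemini is a learned model coupled with the Phoenics optimizer, whereas this sketch uses off-the-shelf Gaussian process regressors, a toy one-dimensional objective, and expected improvement. All names and data below are assumptions for illustration only.

```python
# Minimal sketch (NOT the authors' implementation) of proxy-based bias
# correction for autonomous optimization: scarce expensive measurements
# are used only to learn the systematic residual between two evaluation
# methods, and the corrected prediction drives an acquisition function.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def f_expensive(x):
    # Stand-in for an expensive objective (e.g. a DFT bandgap or a
    # measured overpotential).
    return np.sin(3.0 * x) + 0.3 * x

def f_cheap(x):
    # Cheap proxy that tracks the expensive objective up to a smooth,
    # systematic bias.
    return f_expensive(x) + 0.4 * (x - 0.5) ** 2 + 0.2

# Sparse-data regime: many cheap evaluations, very few expensive ones.
X_cheap = rng.uniform(0.0, 1.0, (60, 1))
y_cheap = f_cheap(X_cheap).ravel()
X_exp = rng.uniform(0.0, 1.0, (6, 1))
y_exp = f_expensive(X_exp).ravel()

kernel = RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-3)

# Step 1: model the cheap proxy from its abundant data.
gp_cheap = GaussianProcessRegressor(kernel=kernel).fit(X_cheap, y_cheap)

# Step 2: model only the residual (systematic bias) expensive - proxy,
# which is all the scarce expensive data have to explain.
residual = y_exp - gp_cheap.predict(X_exp)
gp_bias = GaussianProcessRegressor(kernel=kernel).fit(X_exp, residual)

def predict_expensive(X):
    """Bias-corrected mean and uncertainty for the expensive objective."""
    mean = gp_cheap.predict(X) + gp_bias.predict(X)
    _, std = gp_bias.predict(X, return_std=True)
    return mean, std

# Step 3: plug the corrected prediction into an acquisition function
# (plain expected improvement here; the paper instead couples its model
# with the Phoenics Bayesian optimizer).
def expected_improvement(X, y_best):
    mu, sigma = predict_expensive(X)
    z = (y_best - mu) / np.maximum(sigma, 1e-9)  # minimization
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

X_grid = np.linspace(0.0, 1.0, 201).reshape(-1, 1)
x_next = X_grid[np.argmax(expected_improvement(X_grid, y_exp.min()))]
print("next expensive measurement suggested at x =", float(x_next[0]))
```

The design point mirrored here is that the scarce expensive data are spent only on learning the systematic residual between the two evaluation methods, so most of the response surface is carried by the inexpensive proxy.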
Related papers
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
arXiv Detail & Related papers (2024-10-17T11:47:56Z) - AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs [61.13296177652599]
This paper demonstrates that the optimal composition of training data from different domains is scale-dependent.
We introduce *AutoScale*, a novel, practical approach for optimizing data compositions at potentially large training data scales.
Our evaluation on GPT-2 Large and BERT pre-training demonstrates *AutoScale*'s effectiveness in improving training convergence and downstream performance.
arXiv Detail & Related papers (2024-07-29T17:06:30Z) - Simulation-Enhanced Data Augmentation for Machine Learning Pathloss
Prediction [9.664420734674088]
This paper introduces a novel simulation-enhanced data augmentation method for machine learning pathloss prediction.
Our method integrates synthetic data generated from a cellular coverage simulator and independently collected real-world datasets.
The integration of synthetic data significantly improves the generalizability of the model in different environments.
arXiv Detail & Related papers (2024-02-03T00:38:08Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves, for example, the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - Enhancing Multi-Objective Optimization through Machine Learning-Supported Multiphysics Simulation [1.6685829157403116]
This paper presents a methodological framework for training, self-optimising, and self-organising surrogate models.
We show that surrogate models can be trained on relatively small amounts of data to approximate the underlying simulations accurately.
arXiv Detail & Related papers (2023-09-22T20:52:50Z) - Maximize to Explore: One Objective Function Fusing Estimation, Planning,
and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while automatically balancing exploration and exploitation.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z) - Variational Factorization Machines for Preference Elicitation in
Large-Scale Recommender Systems [17.050774091903552]
We propose a variational formulation of factorization machines (FMs) that can be easily optimized using standard mini-batch gradient descent.
Our algorithm learns an approximate posterior distribution over the user and item parameters, which leads to confidence intervals over the predictions.
We show, using several datasets, that it has comparable or better performance than existing methods in terms of prediction accuracy.
arXiv Detail & Related papers (2022-12-20T00:06:28Z) - Model Calibration of the Liquid Mercury Spallation Target using
Evolutionary Neural Networks and Sparse Polynomial Expansions [4.3634848427203945]
We present two approaches for surrogate-based model calibration of expensive simulations using evolutionary neural networks and sparse expansions.
The proposed simulations can significantly aid in fatigue analysis to estimate the mercury target lifetime and integrity.
However, an important conclusion from this work points to a deficiency in the current equation-of-state-based model in capturing the full physics of the spallation reaction.
arXiv Detail & Related papers (2022-02-18T18:47:10Z) - TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z) - Towards fast machine-learning-assisted Bayesian posterior inference of
realistic microseismic events [0.0]
We train a machine learning algorithm on the power spectrum of the recorded pressure wave.
We show that our approach is computationally inexpensive, as it can be run in less than 1 hour on a commercial laptop.
arXiv Detail & Related papers (2021-01-12T19:51:32Z) - AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to 50×), with significantly reduced training data generation (up to 30×) and better accuracy (+8.7%) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.