Targeted active learning for probabilistic models
- URL: http://arxiv.org/abs/2210.12122v1
- Date: Fri, 21 Oct 2022 17:22:03 GMT
- Title: Targeted active learning for probabilistic models
- Authors: Christopher Tosh and Mauricio Tec and Wesley Tansey
- Abstract summary: A fundamental task in science is to design experiments that yield valuable insights about the system under study.
We present PDBAL, a targeted active learning method that adaptively designs experiments to maximize scientific utility.
- Score: 8.615625517708324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A fundamental task in science is to design experiments that yield valuable
insights about the system under study. Mathematically, these insights can be
represented as a utility or risk function that shapes the value of conducting
each experiment. We present PDBAL, a targeted active learning method that
adaptively designs experiments to maximize scientific utility. PDBAL takes a
user-specified risk function and combines it with a probabilistic model of the
experimental outcomes to choose designs that rapidly converge on a high-utility
model. We prove theoretical bounds on the label complexity of PDBAL and provide
fast closed-form solutions for designing experiments with common exponential
family likelihoods. In simulation studies, PDBAL consistently outperforms
standard untargeted approaches that focus on maximizing expected information
gain over the design space. Finally, we demonstrate the scientific potential of
PDBAL through a study on a large cancer drug screen dataset where PDBAL quickly
recovers the most efficacious drugs with a small fraction of the total number
of experiments.
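
The targeted-versus-untargeted contrast described in the abstract can be illustrated with a toy sketch. This is not the paper's PDBAL algorithm. It is a simplified illustration under stated assumptions: a small, equally weighted ensemble of candidate models stands in for the posterior, outcomes are Gaussian with known noise, and the user-specified risk is a 0/1 distance on one quantity of interest. All names (`expected_disagreement`, `choose_targeted`, `choose_untargeted`, `sign_distance`, `MODELS`) are hypothetical, and ensemble variance is used only as a crude stand-in for an untargeted criterion such as expected information gain.

```python
import math
import random
import statistics


def gaussian_likelihood(y, mean, sigma=1.0):
    """Unnormalized Gaussian density; the constant cancels after normalization."""
    return math.exp(-0.5 * ((y - mean) / sigma) ** 2)


def expected_disagreement(models, design, distance, n_draws=20, sigma=1.0, rng=None):
    """Monte Carlo estimate of the expected risk-weighted disagreement between
    pairs of models after observing one simulated outcome at `design`.
    Assumes a uniform weighting over `models` (a hypothetical posterior sample)."""
    rng = rng or random.Random(0)
    total = 0.0
    for truth in models:  # treat each candidate in turn as the data generator
        for _ in range(n_draws):
            y = rng.gauss(truth[design], sigma)  # simulated experimental outcome
            weights = [gaussian_likelihood(y, m[design], sigma) for m in models]
            norm = sum(weights)
            pairwise = sum(
                wi * wj * distance(mi, mj)
                for wi, mi in zip(weights, models)
                for wj, mj in zip(weights, models)
            )
            total += pairwise / norm ** 2
    return total / (len(models) * n_draws)


def choose_targeted(models, designs, distance):
    """Pick the design expected to shrink target-relevant disagreement the most."""
    return min(designs, key=lambda x: expected_disagreement(models, x, distance))


def choose_untargeted(models, designs):
    """Crude untargeted proxy: pick the design with the largest ensemble variance."""
    return max(designs, key=lambda x: statistics.pvariance([m[x] for m in models]))


def sign_distance(a, b):
    """Hypothetical risk: only the sign of the effect at design 1 matters."""
    return 1.0 if (a[1] > 0) != (b[1] > 0) else 0.0


# Each model lists the mean outcome at design 0 and design 1.
MODELS = [(-3.0, 1.0), (3.0, 1.0), (-3.0, -1.0), (3.0, -1.0)]

if __name__ == "__main__":
    print("targeted choice:", choose_targeted(MODELS, [0, 1], sign_distance))
    print("untargeted choice:", choose_untargeted(MODELS, [0, 1]))
```

In this toy setup the models disagree most about design 0, so the variance proxy selects it, while the targeted rule selects design 1 because only that experiment reduces disagreement about the quantity the risk function cares about.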
Related papers
- Physical formula enhanced multi-task learning for pharmacokinetics prediction [54.13787789006417]
A major challenge for AI-driven drug discovery is the scarcity of high-quality data.
We develop a formula enhanced multi-task learning (PEMAL) method that predicts four key parameters of pharmacokinetics simultaneously.
Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks.
arXiv Detail & Related papers (2024-04-16T07:42:55Z)
- DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design [61.48963555382729]
We propose DiscoBAX as a sample-efficient method for maximizing the rate of significant discoveries per experiment.
We provide theoretical guarantees of approximate optimality under standard assumptions, and conduct a comprehensive experimental evaluation.
arXiv Detail & Related papers (2023-12-07T06:05:39Z)
- PIGNet2: A Versatile Deep Learning-based Protein-Ligand Interaction Prediction Model for Binding Affinity Scoring and Virtual Screening [0.0]
Prediction of protein-ligand interactions (PLI) plays a crucial role in drug discovery.
The development of a versatile model capable of accurately scoring binding affinity and conducting efficient virtual screening remains a challenge.
Here, we propose a viable solution by introducing a novel data augmentation strategy combined with a physics-informed graph neural network.
arXiv Detail & Related papers (2023-07-03T14:46:49Z)
- Machine learning enabled experimental design and parameter estimation for ultrafast spin dynamics [54.172707311728885]
We introduce a methodology that combines machine learning with Bayesian optimal experimental design (BOED).
Our method employs a neural network model for large-scale spin dynamics simulations for precise distribution and utility calculations in BOED.
Our numerical benchmarks demonstrate the superior performance of our method in guiding XPFS experiments, predicting model parameters, and yielding more informative measurements within limited experimental time.
arXiv Detail & Related papers (2023-06-03T06:19:20Z)
- Designing Optimal Behavioral Experiments Using Machine Learning [8.759299724881219]
We provide a tutorial on leveraging recent advances in BOED and machine learning to find optimal experiments for any kind of model.
We consider theories of how people balance exploration and exploitation in multi-armed bandit decision-making tasks.
Compared to experimental designs commonly used in the literature, we show that our optimal designs more efficiently determine which of a set of models best accounts for individual human behavior.
arXiv Detail & Related papers (2023-05-12T18:24:30Z)
- Online simulator-based experimental design for cognitive model selection [74.76661199843284]
We propose BOSMOS: an approach to experimental design that can select between computational models without tractable likelihoods.
In simulated experiments, we demonstrate that the proposed BOSMOS technique can accurately select models in up to 2 orders of magnitude less time than existing LFI alternatives.
arXiv Detail & Related papers (2023-03-03T21:41:01Z)
- GFlowNets for AI-Driven Scientific Discovery [74.27219800878304]
We present a new probabilistic machine learning framework called GFlowNets.
GFlowNets can be applied in the modeling, hypotheses generation and experimental design stages of the experimental science loop.
We argue that GFlowNets can become a valuable tool for AI-driven scientific discovery.
arXiv Detail & Related papers (2023-02-01T17:29:43Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
- Bayesian Optimal Experimental Design for Simulator Models of Cognition [14.059933880568908]
We combine recent advances in BOED and approximate inference for intractable models to find optimal experimental designs.
Our simulation experiments on multi-armed bandit tasks show that our method results in improved model discrimination and parameter estimation.
arXiv Detail & Related papers (2021-10-29T09:04:01Z)
- GeneDisco: A Benchmark for Experimental Design in Drug Discovery [41.6425999218259]
In vitro cellular experimentation with genetic interventions is an essential step in early-stage drug discovery.
GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.
arXiv Detail & Related papers (2021-10-22T16:01:39Z)