Automating reward function configuration for drug design
- URL: http://arxiv.org/abs/2312.09865v1
- Date: Fri, 15 Dec 2023 15:09:16 GMT
- Title: Automating reward function configuration for drug design
- Authors: Marius Urbonas, Temitope Ajileye, Paul Gainer and Douglas Pires
- Abstract summary: We propose a novel approach for automated reward configuration that relies solely on experimental data.
We show that our algorithm yields reward functions that outperform the predictive accuracy of human-defined functions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing reward functions that guide generative molecular design (GMD)
algorithms to desirable areas of chemical space is of critical importance in
AI-driven drug discovery. Traditionally, this has been a manual and error-prone
task; the selection of appropriate computational methods to approximate
biological assays is challenging and the aggregation of computed values into a
single score even more so, leading to potential reliance on trial-and-error
approaches. We propose a novel approach for automated reward configuration that
relies solely on experimental data, mitigating the challenges of manual reward
adjustment on drug discovery projects. Our method achieves this by constructing
a ranking over experimental data based on Pareto dominance over the
multi-objective space, then training a neural network to approximate the reward
function such that rankings determined by the predicted reward correlate with
those determined by the Pareto dominance relation. We validate our method using
two case studies. In the first study we simulate Design-Make-Test-Analyse
(DMTA) cycles by alternating reward function updates and generative runs guided
by that function. We show that the learned function adapts over time to yield
compounds that score highly with respect to evaluation functions taken from the
literature. In the second study we apply our algorithm to historical data from
four real drug discovery projects. We show that our algorithm yields reward
functions that outperform the predictive accuracy of human-defined functions,
achieving an improvement of up to 0.4 in Spearman's correlation against a
ground truth evaluation function that encodes the target drug profile for that
project. Our method provides an efficient data-driven way to configure reward
functions for GMD, and serves as a strong baseline for future research into
transformative approaches for the automation of drug discovery.
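As an illustration of the approach described in the abstract, the sketch below ranks experimental results by Pareto dominance and then fits a small neural network with a pairwise ranking loss so that predicted rewards agree with that ordering. The function names, network architecture, margin loss, and hyperparameters are assumptions made for this sketch; they are not the authors' implementation.

```python
import numpy as np
import torch
import torch.nn as nn

def dominates(a: np.ndarray, b: np.ndarray) -> bool:
    """Return True if objective vector `a` Pareto-dominates `b`
    (assumes every objective is to be maximised)."""
    return np.all(a >= b) and np.any(a > b)

def pareto_rank(objectives: np.ndarray) -> np.ndarray:
    """Non-dominated sorting: rank 0 is the Pareto front, rank 1 is the
    front obtained after removing rank 0, and so on."""
    n = objectives.shape[0]
    ranks = np.full(n, -1)
    remaining = set(range(n))
    current = 0
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objectives[j], objectives[i])
                            for j in remaining if j != i)]
        for i in front:
            ranks[i] = current
        remaining -= set(front)
        current += 1
    return ranks

class RewardNet(nn.Module):
    """Small MLP mapping a molecular feature vector to a scalar reward."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, x):
        return self.net(x).squeeze(-1)

def train_reward(features: np.ndarray, objectives: np.ndarray,
                 epochs: int = 200, lr: float = 1e-3) -> RewardNet:
    """Fit the reward so that compounds with a better (lower) Pareto rank
    receive a higher predicted reward, via a pairwise margin ranking loss."""
    ranks = pareto_rank(objectives)
    x = torch.tensor(features, dtype=torch.float32)
    model = RewardNet(features.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MarginRankingLoss(margin=0.1)
    # All ordered pairs (i, j) where i has a strictly better Pareto rank than j.
    pairs = [(i, j) for i in range(len(ranks)) for j in range(len(ranks))
             if ranks[i] < ranks[j]]
    if not pairs:  # nothing to rank, e.g. all points are mutually non-dominated
        return model
    for _ in range(epochs):
        r = model(x)
        better = r[[i for i, _ in pairs]]
        worse = r[[j for _, j in pairs]]
        target = torch.ones(len(pairs))
        loss = loss_fn(better, worse, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

In a DMTA-style loop as described above, a step like `train_reward` would be repeated whenever new experimental data arrive, and the refreshed model would then score candidates proposed by the generative algorithm.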
Related papers
- Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration [15.463313629574111]
This paper investigates how to achieve sample-efficient exploration in continuous control tasks.
We introduce an RL algorithm that incorporates a predictive model and off-policy learning elements.
We derive an intrinsic reward without incurring parameter overhead.
arXiv Detail & Related papers (2024-03-31T11:39:11Z) - Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks [58.469818546042696]
We study the sample efficiency of OPE with human preference and establish a statistical guarantee for it.
By appropriately selecting the size of a ReLU network, we show that one can leverage any low-dimensional manifold structure in the Markov decision process.
arXiv Detail & Related papers (2023-10-16T16:27:06Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions [30.305418761024143]
Real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift.
We present a principled way to encode explicit prior knowledge of the data-generating process into a prior distribution.
We demonstrate that using Q-SAVI to integrate prior knowledge of drug-like chemical space into the modeling process affords substantial gains in accuracy and calibration.
arXiv Detail & Related papers (2023-07-14T05:01:10Z) - FAStEN: An Efficient Adaptive Method for Feature Selection and Estimation in High-Dimensional Functional Regressions [7.674715791336311]
We propose a new, flexible and ultra-efficient approach to perform feature selection in a sparse function-on-function regression problem.
We show how to extend it to the scalar-on-function framework.
We present an application to brain fMRI data from the AOMIC PIOP1 study.
arXiv Detail & Related papers (2023-03-26T19:41:17Z) - MetaRF: Differentiable Random Forest for Reaction Yield Prediction with a Few Trails [58.47364143304643]
In this paper, we focus on the reaction yield prediction problem.
We first put forth MetaRF, an attention-based differentiable random forest model specially designed for the few-shot yield prediction.
To improve the few-shot learning performance, we further introduce a dimension-reduction based sampling method.
arXiv Detail & Related papers (2022-08-22T06:40:13Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (an illustrative sketch follows this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z) - Categorical EHR Imputation with Generative Adversarial Nets [11.171712535005357]
We propose a simple and yet effective approach that is based on previous work on GANs for data imputation.
We show that our imputation approach largely improves the prediction accuracy, compared to more traditional data imputation approaches.
arXiv Detail & Related papers (2021-08-03T18:50:26Z) - Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
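For the Average Thresholded Confidence (ATC) entry above, the summary is concrete enough to sketch. The snippet below is an illustrative reading of that summary only; the function names, the use of max-softmax confidence, and the quantile-based threshold fit are assumptions, not the authors' released code.

```python
import numpy as np

def fit_atc_threshold(source_conf: np.ndarray, source_correct: np.ndarray) -> float:
    """Choose a confidence threshold on labeled source data such that the
    fraction of source examples above it matches the source accuracy."""
    accuracy = source_correct.mean()
    # The (1 - accuracy) quantile leaves roughly `accuracy` of the mass above it.
    return float(np.quantile(source_conf, 1.0 - accuracy))

def estimate_target_accuracy(target_conf: np.ndarray, threshold: float) -> float:
    """Predict target-domain accuracy as the fraction of unlabeled target
    examples whose confidence exceeds the learned threshold."""
    return float((target_conf > threshold).mean())

# Example usage with assumed inputs:
#   source_conf    - max softmax probabilities on a labeled source validation set
#   source_correct - 0/1 indicators of whether each source prediction was correct
#   target_conf    - max softmax probabilities on the unlabeled target set
# t = fit_atc_threshold(source_conf, source_correct)
# predicted_accuracy = estimate_target_accuracy(target_conf, t)
```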
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.