MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning
- URL: http://arxiv.org/abs/2410.11226v1
- Date: Tue, 15 Oct 2024 03:15:05 GMT
- Title: MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning
- Authors: Peter Eckmann, Dongxia Wu, Germano Heinzelmann, Michael K Gilson, Rose Yu,
- Abstract summary: Multi-Fidelity Latent space Active Learning (MF-LAL) is a generative modeling framework that integrates a set of oracles with varying cost-accuracy tradeoffs.
We show that MF-LAL produces compounds with significantly better binding free energy scores than other single and multi-fidelity approaches.
- Score: 16.48834281521776
- License:
- Abstract: Current generative models for drug discovery primarily use molecular docking as an oracle to guide the generation of active compounds. However, such models are often not useful in practice because even compounds with high docking scores do not consistently show experimental activity. More accurate methods for activity prediction exist, such as molecular dynamics based binding free energy calculations, but they are too computationally expensive to use in a generative model. To address this challenge, we propose Multi-Fidelity Latent space Active Learning (MF-LAL), a generative modeling framework that integrates a set of oracles with varying cost-accuracy tradeoffs. Unlike previous approaches that separately learn the surrogate model and generative model, MF-LAL combines the generative and multi-fidelity surrogate models into a single framework, allowing for more accurate activity prediction and higher quality samples. We train MF-LAL with a novel active learning algorithm to further reduce computational cost. Our experiments on two disease-relevant proteins show that MF-LAL produces compounds with significantly better binding free energy scores than other single and multi-fidelity approaches.
Related papers
- Assessing Non-Nested Configurations of Multifidelity Machine Learning for Quantum-Chemical Properties [0.0]
Multifidelity machine learning (MFML) for quantum chemical (QC) properties has seen strong development in the recent years.
This work assesses the use of non-nested training data for two of these multifidelity methods, namely MFML and optimized MFML.
arXiv Detail & Related papers (2024-07-24T08:34:08Z) - Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen [76.02070962797794]
We present Cell Flow for Generation, a flow-based conditional generative model for multi-modal single-cell counts.
Our results suggest improved recovery of crucial biological data characteristics while accounting for novel generative tasks.
arXiv Detail & Related papers (2024-07-16T14:05:03Z) - MFBind: a Multi-Fidelity Approach for Evaluating Drug Compounds in
Practical Generative Modeling [17.689451879343014]
Current generative models for drug discovery primarily use molecular docking to evaluate the quality of generated compounds.
We propose a multi-fidelity approach, Multi-Fidelity Bind (MFBind), to achieve the optimal trade-off between accuracy and computational cost.
arXiv Detail & Related papers (2024-02-16T00:48:20Z) - A Multi-Grained Symmetric Differential Equation Model for Learning
Protein-Ligand Binding Dynamics [74.93549765488103]
In drug discovery, molecular dynamics simulation provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites.
We propose NeuralMD, the first machine learning surrogate that can facilitate numerical MD and provide accurate simulations in protein-ligand binding.
We show the efficiency and effectiveness of NeuralMD, with a 2000$times$ speedup over standard numerical MD simulation and outperforming all other ML approaches by up to 80% under the stability metric.
arXiv Detail & Related papers (2024-01-26T09:35:17Z) - Diffusion-Generative Multi-Fidelity Learning for Physical Simulation [24.723536390322582]
We develop a diffusion-generative multi-fidelity learning method based on differential equations (SDE), where the generation is a continuous denoising process.
By conditioning on additional inputs (temporal or spacial variables), our model can efficiently learn and predict multi-dimensional solution arrays.
arXiv Detail & Related papers (2023-11-09T18:59:05Z) - Mixed Models with Multiple Instance Learning [51.440557223100164]
We introduce MixMIL, a framework integrating Generalized Linear Mixed Models (GLMM) and Multiple Instance Learning (MIL)
Our empirical results reveal that MixMIL outperforms existing MIL models in single-cell datasets.
arXiv Detail & Related papers (2023-11-04T16:42:42Z) - Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF)
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z) - Learning continuous models for continuous physics [94.42705784823997]
We develop a test based on numerical analysis theory to validate machine learning models for science and engineering applications.
Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
arXiv Detail & Related papers (2022-02-17T07:56:46Z) - MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization [51.00815310242277]
generative models and reinforcement learning approaches made initial success, but still face difficulties in simultaneously optimizing multiple drug properties.
We propose the MultI-constraint MOlecule SAmpling (MIMOSA) approach, a sampling framework to use input molecule as an initial guess and sample molecules from the target distribution.
arXiv Detail & Related papers (2020-10-05T20:18:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.