MFBind: a Multi-Fidelity Approach for Evaluating Drug Compounds in
Practical Generative Modeling
- URL: http://arxiv.org/abs/2402.10387v1
- Date: Fri, 16 Feb 2024 00:48:20 GMT
- Authors: Peter Eckmann, Dongxia Wu, Germano Heinzelmann, Michael K Gilson, Rose
Yu
- Abstract summary: Current generative models for drug discovery primarily use molecular docking to evaluate the quality of generated compounds.
We propose a multi-fidelity approach, Multi-Fidelity Bind (MFBind), to achieve the optimal trade-off between accuracy and computational cost.
- Score: 17.689451879343014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current generative models for drug discovery primarily use molecular docking
to evaluate the quality of generated compounds. However, such models are often
not useful in practice because even compounds with high docking scores do not
consistently show experimental activity. More accurate methods for activity
prediction exist, such as molecular dynamics based binding free energy
calculations, but they are too computationally expensive to use in a generative
model. We propose a multi-fidelity approach, Multi-Fidelity Bind (MFBind), to
achieve the optimal trade-off between accuracy and computational cost. MFBind
integrates docking and binding free energy simulators to train a multi-fidelity
deep surrogate model with active learning. Our deep surrogate model utilizes a
pretraining technique and linear prediction heads to efficiently fit small
amounts of high-fidelity data. We perform extensive experiments and show that
MFBind (1) outperforms other state-of-the-art single and multi-fidelity
baselines in surrogate modeling, and (2) boosts the performance of generative
models with markedly higher quality compounds.
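The idea of a shared encoder with one linear prediction head per fidelity can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the MFBind implementation: the frozen random feature map stands in for a pretrained encoder, the synthetic targets stand in for docking (low fidelity) and binding free energy (high fidelity) labels, and the names `encode`, `fit_linear_head`, and `predict` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen encoder: a fixed random feature map standing in for a
# pretrained network (the pretraining itself is not shown here).
N_INPUT, N_FEAT = 8, 32
W_enc = rng.normal(size=(N_INPUT, N_FEAT))
b_enc = rng.normal(size=N_FEAT)

def encode(x):
    """Map raw inputs of shape (n, N_INPUT) to shared features (n, N_FEAT)."""
    return np.tanh(x @ W_enc + b_enc)

def fit_linear_head(feats, y, ridge=1e-3):
    """Fit one linear head (weights plus bias) by closed-form ridge regression."""
    A = np.hstack([feats, np.ones((len(feats), 1))])
    reg = ridge * np.eye(A.shape[1])
    return np.linalg.solve(A.T @ A + reg, A.T @ y)

def predict(feats, w):
    """Apply a fitted linear head to shared features."""
    A = np.hstack([feats, np.ones((len(feats), 1))])
    return A @ w

# Synthetic stand-ins: many cheap low-fidelity labels, few costly
# high-fidelity labels (assumed data, not real docking or free energy values).
v = rng.normal(size=N_INPUT)
x_lo = rng.normal(size=(500, N_INPUT))
x_hi = rng.normal(size=(20, N_INPUT))
y_lo = x_lo @ v + 0.5 * rng.normal(size=500)  # noisy approximation
y_hi = x_hi @ v                               # accurate but scarce

# One head per fidelity on top of the same frozen features.
w_lo = fit_linear_head(encode(x_lo), y_lo)
w_hi = fit_linear_head(encode(x_hi), y_hi)

# Score new candidates with the high-fidelity head.
x_new = rng.normal(size=(5, N_INPUT))
pred = predict(encode(x_new), w_hi)
```

Because each head is linear in the shared features, it can be fit in closed form even from a small number of high-fidelity samples; the ridge term keeps the solve well-posed when there are fewer samples than features. Active learning, which the abstract also mentions, is not covered by this sketch.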
Related papers
- MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning [16.48834281521776]
Multi-Fidelity Latent space Active Learning (MF-LAL) is a generative modeling framework that integrates a set of oracles with varying cost-accuracy tradeoffs.
We show that MF-LAL produces compounds with significantly better binding free energy scores than other single and multi-fidelity approaches.
arXiv Detail & Related papers (2024-10-15T03:15:05Z)
- Assessing Non-Nested Configurations of Multifidelity Machine Learning for Quantum-Chemical Properties [0.0]
Multifidelity machine learning (MFML) for quantum chemical (QC) properties has seen strong development in recent years.
This work assesses the use of non-nested training data for two of these multifidelity methods, namely MFML and optimized MFML.
arXiv Detail & Related papers (2024-07-24T08:34:08Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- On Least Square Estimation in Softmax Gating Mixture of Experts [78.3687645289918]
We investigate the performance of the least squares estimators (LSE) under a deterministic MoE model.
We establish a condition called strong identifiability to characterize the convergence behavior of various types of expert functions.
Our findings have important practical implications for expert selection.
arXiv Detail & Related papers (2024-02-05T12:31:18Z)
- A Multi-Grained Symmetric Differential Equation Model for Learning Protein-Ligand Binding Dynamics [74.93549765488103]
In drug discovery, molecular dynamics simulation provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites.
We propose NeuralMD, the first machine learning surrogate that can facilitate numerical MD and provide accurate simulations in protein-ligand binding.
We show the efficiency and effectiveness of NeuralMD, with a 2000$\times$ speedup over standard numerical MD simulation and outperforming all other ML approaches by up to 80% under the stability metric.
arXiv Detail & Related papers (2024-01-26T09:35:17Z)
- Integrating Chemical Language and Molecular Graph in Multimodal Fused Deep Learning for Drug Property Prediction [9.388979080270103]
We construct multimodal deep learning models to cover different molecular representations.
Compared with mono-modal models, our multimodal fused deep learning (MMFDL) models outperform single models in accuracy, reliability, and resistance capability against noise.
arXiv Detail & Related papers (2023-12-29T07:19:42Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- Multi-fidelity Hierarchical Neural Processes [79.0284780825048]
Multi-fidelity surrogate modeling reduces the computational cost by fusing different simulation outputs.
We propose Multi-fidelity Hierarchical Neural Processes (MF-HNP), a unified neural latent variable model for multi-fidelity surrogate modeling.
We evaluate MF-HNP on epidemiology and climate modeling tasks, achieving competitive performance in terms of accuracy and uncertainty estimation.
arXiv Detail & Related papers (2022-06-10T04:54:13Z)
- Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E$^3$), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models, while using up to 45% fewer FLOPs than a deep ensemble.
arXiv Detail & Related papers (2021-10-07T11:58:35Z)
- Adaptive Reliability Analysis for Multi-fidelity Models using a Collective Learning Strategy [6.368679897630892]
This study presents a new approach called adaptive multi-fidelity Gaussian process for reliability analysis (AMGPRA).
It is shown that the proposed method achieves similar or higher accuracy with reduced computational costs compared to state-of-the-art single and multi-fidelity methods.
A key application of AMGPRA is high-fidelity fragility modeling using complex and costly physics-based computational models.
arXiv Detail & Related papers (2021-09-21T14:42:58Z)
- A data-driven peridynamic continuum model for upscaling molecular dynamics [3.1196544696082613]
We propose a learning framework to extract, from molecular dynamics data, an optimal Linear Peridynamic Solid model.
We provide sufficient well-posedness conditions for discretized LPS models with sign-changing influence functions.
This framework guarantees that the resulting model is mathematically well-posed, physically consistent, and that it generalizes well to settings that are different from the ones used during training.
arXiv Detail & Related papers (2021-08-04T07:07:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.