Related papers: Improved prediction of ligand-protein binding affinities by meta-modeling

Improved prediction of ligand-protein binding affinities by meta-modeling

URL: http://arxiv.org/abs/2310.03946v3
Date: Sat, 18 May 2024 16:56:40 GMT
Title: Improved prediction of ligand-protein binding affinities by meta-modeling
Authors: Ho-Joon Lee, Prashant S. Emani, Mark B. Gerstein,
Abstract summary: We develop a framework to integrate published force-field-based empirical docking and sequence-based deep learning models. We show that many of our meta-models significantly improve affinity predictions over base models. Our best meta-models achieve comparable performance to state-of-the-art deep learning tools exclusively based on structures.
Score: 1.3859669037499769
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The accurate screening of candidate drug ligands against target proteins through computational approaches is of prime interest to drug development efforts. Such virtual screening depends in part on methods to predict the binding affinity between ligands and proteins. Many computational models for binding affinity prediction have been developed, but with varying results across targets. Given that ensembling or meta-modeling methods have shown great promise in reducing model-specific biases, we develop a framework to integrate published force-field-based empirical docking and sequence-based deep learning models. In building this framework, we evaluate many combinations of individual base models, training databases, and several meta-modeling approaches. We show that many of our meta-models significantly improve affinity predictions over base models. Our best meta-models achieve comparable performance to state-of-the-art deep learning tools exclusively based on structures, while allowing for improved database scalability and flexibility through the explicit inclusion of features such as physicochemical properties or molecular descriptors. Overall, we demonstrate that diverse modeling approaches can be ensembled together to gain improvement in binding affinity prediction.

Related papers

Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets.<n>We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms.<n>Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
arXiv Detail & Related papers (2025-07-12T08:10:10Z)
Relative Overfitting and Accept-Reject Framework [5.465098504510676]
We propose an ensemble framework that governs how models are segmented to ensure performance improvement.<n>We detail the patterns of this framework within the domain of NLP and briefly describe its to other fields, such as computer vision (CV) and AI for science.
arXiv Detail & Related papers (2025-05-12T17:36:14Z)
Enhancing Time Series Forecasting via a Parallel Hybridization of ARIMA and Polynomial Classifiers [1.9799527196428246]
We propose a hybrid forecasting approach that integrates the ARIMA model with a classifier to leverage the complementary strengths of both models.<n>The proposed hybrid model consistently outperforms the individual models in terms of prediction accuracy, al-beit with a modest increase in execution time.
arXiv Detail & Related papers (2025-05-11T06:53:19Z)
A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models. Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning. We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z)
Integrating Large Language Models for Genetic Variant Classification [12.244115429231888]
Large Language Models (LLMs) have emerged as transformative tools in genetics. This study investigates the integration of state-of-the-art LLMs, including GPN-MSA, ESM1b, and AlphaMissense. Our approach evaluates these integrated models using the well-annotated ProteinGym and ClinVar datasets.
arXiv Detail & Related papers (2024-11-07T13:45:56Z)
Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods. Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions. We demonstrate this approach lower bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis [14.526536510805755]
We present a comprehensive framework for predicting the effects of perturbations in single cells, designed to standardize benchmarking in this rapidly evolving field. Our framework, PerturBench, includes a user-friendly platform, diverse datasets, metrics for fair model comparison, and detailed performance analysis.
arXiv Detail & Related papers (2024-08-20T07:40:20Z)
Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models. We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models. Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery has proposed to factorize the data generating process into a set of modules. We study the generalization and adaption performance of such modular neural causal models. Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is more interesting to understand the properties of the model, which parts could be replaced to obtain better results. We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors. We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method. Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
Improved Protein-ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference [3.761791311908692]
Predicting accurate protein-ligand binding affinity is important in drug discovery. Recent advances in the deep convolutional and graph neural network based approaches, the model performance depends on the input data representation. We present fusion models to benefit from different feature representations of two neural network models to improve the binding affinity prediction.
arXiv Detail & Related papers (2020-05-17T22:26:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.