Related papers: Protein 3D structure-based neural networks highly improve the accuracy in compound-protein binding affinity prediction

URL: http://arxiv.org/abs/2204.12586v1
Date: Wed, 30 Mar 2022 00:44:15 GMT
Title: Protein 3D structure-based neural networks highly improve the accuracy in compound-protein binding affinity prediction
Authors: Binjie Guo, Hanyu Zheng, Huan Huang, Haohan Jiang, Xiaodan Li, Naiyu Guan, Yanming Zuo, Yicheng Zhang, Hengfu Yang, Xuhua Wang
Abstract summary: We develop Fast Evolutional Attention and Thoroughgoing-graph Neural Networks (FeatNN) to facilitate the application of protein 3D structure information for predicting compound-protein binding affinities (CPAs). FeatNN considerably outperforms various state-of-the-art baselines in CPA prediction with the Pearson value elevated by about 35.7%.
Score: 7.059949221160259
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Theoretically, the accuracy of computational models in predicting compound-protein binding affinities (CPAs) could be improved by the introduction of protein 3D structure information. However, most of these models still suffer from a low accuracy due to the lack of an efficient approach to encode informative protein features. The major challenge is how to combine the multi-modal information such as the residue sequence of the protein, residue atom coordinates and the torsion angles. To tackle this problem, we develop Fast Evolutional Attention and Thoroughgoing-graph Neural Networks (FeatNN) to facilitate the application of protein 3D structure information for predicting CPAs. Specifically, we established a novel end-to-end architecture to jointly embed torsion matrix, discrete distance matrix, and sequence information of protein and extract compound features with deep graph convolution layers. In addition, a new pairwise mapping attention mechanism is introduced to comprehensively learn potential interaction information between proteins and compounds. FeatNN considerably outperforms various state-of-the-art baselines in CPA prediction with the Pearson value elevated by about 35.7%. Thus, FeatNN provides an outstanding method for highly accurate CPA prediction and facilitates high-throughput virtual screening of drug candidates.
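
The abstract names two ingredients that are easy to illustrate: a discrete distance matrix built from residue atom coordinates, and the Pearson value used to score predictions. The following is a minimal sketch under stated assumptions, not the FeatNN implementation: the function names `discrete_distance_matrix` and `pearson`, the bin edges, and the toy coordinates and affinities are all hypothetical and chosen only for illustration.

```python
import math


def discrete_distance_matrix(coords, bin_edges):
    """Map every pairwise residue distance to the index of its distance bin.

    coords    : list of (x, y, z) tuples, one per residue (hypothetical input)
    bin_edges : ascending distance thresholds (assumed values, not from the paper)
    """
    n = len(coords)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(coords[i], coords[j])
            # Bin index = number of thresholds the distance meets or exceeds.
            matrix[i][j] = sum(d >= edge for edge in bin_edges)
    return matrix


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data: three residues and two assumed thresholds (4 and 8 angstroms).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
toy_bins = discrete_distance_matrix(toy_coords, bin_edges=[4.0, 8.0])

# Toy affinities: made-up predicted and measured values.
toy_r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Discretizing distances turns a continuous geometric quantity into a small vocabulary of categories that an embedding layer can consume, which is one plausible reading of "discrete distance matrix" in the abstract. The choice and number of bins here are placeholders.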

Related papers

Bidirectional Hierarchical Protein Multi-Modal Representation Learning [4.682021474006426]
Protein language models (pLMs) pretrained on large scale protein sequences have demonstrated significant success in sequence-based tasks. Graph neural networks (GNNs) designed to leverage 3D structural information have shown promising generalization in protein-related prediction tasks. Our framework employs attention and gating mechanisms to enable effective interaction between pLMs-generated sequential representations and GNN-extracted structural features.
arXiv Detail & Related papers (2025-04-07T06:47:49Z)
SE(3)-Equivariant Ternary Complex Prediction Towards Target Protein Degradation [28.648225112411637]
Targeted protein degradation (TPD) induced by small molecules has emerged as a rapidly evolving modality in drug discovery. DeepTernary is a novel deep learning-based approach that directly predicts ternary structures in an end-to-end manner.
arXiv Detail & Related papers (2025-02-26T06:33:24Z)
SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models. It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features. Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z)
Dumpling GNN: Hybrid GNN Enables Better ADC Payload Activity Prediction Based on Chemical Structure [53.76752789814785]
DumplingGNN is a hybrid Graph Neural Network architecture specifically designed for predicting ADC payload activity based on chemical structure. We evaluate it on a comprehensive ADC payload dataset focusing on DNA Topoisomerase I inhibitors. It demonstrates exceptional accuracy (91.48%), sensitivity (95.08%), and specificity (97.54%) on our specialized ADC payload dataset.
arXiv Detail & Related papers (2024-09-23T17:11:04Z)
Protein binding affinity prediction under multiple substitutions applying eGNNs on Residue and Atomic graphs combined with Language model information: eGRAL [1.840390797252648]
Deep learning is increasingly recognized as a powerful tool capable of bridging the gap between in-silico predictions and in-vitro observations. We propose eGRAL, a novel graph neural network architecture designed for predicting binding affinity changes from amino acid substitutions in protein complexes. eGRAL leverages residue, atomic and evolutionary scales, thanks to features extracted from protein large language models.
arXiv Detail & Related papers (2024-05-03T10:33:19Z)
DeepGATGO: A Hierarchical Pretraining-Based Graph-Attention Model for Automatic Protein Function Prediction [4.608328575930055]
Automatic protein function prediction (AFP) is classified as a large-scale multi-label classification problem. Currently, popular methods primarily combine protein-related information and Gene Ontology (GO) terms to generate final functional predictions. We propose a sequence-based hierarchical prediction method, DeepGATGO, which processes protein sequences and GO term labels hierarchically.
arXiv Detail & Related papers (2023-07-24T07:01:32Z)
State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures. Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
DProQ: A Gated-Graph Transformer for Protein Complex Structure Assessment [7.988932562855392]
DProQ is a gated neighborhood-modulating Graph Transformer (GGT) designed to predict the quality of 3D protein complex structures. We incorporate node and edge gates within a novel Graph Transformer framework to control information flow during graph message passing. Our rigorous experiments demonstrate that DProQ achieves state-of-the-art performance in ranking protein complex structures.
arXiv Detail & Related papers (2022-05-21T15:41:46Z)
Structure-aware Protein Self-supervised Learning [50.04673179816619]
We propose a novel structure-aware protein self-supervised learning method to capture structural information of proteins. In particular, a well-designed graph neural network (GNN) model is pretrained to preserve the protein structural information. We identify the relation between the sequential information in the protein language model and the structural information in the specially designed GNN model via a novel pseudo bi-level optimization scheme.
arXiv Detail & Related papers (2022-04-06T02:18:41Z)
EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network. Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
PersGNN: Applying Topological Data Analysis and Geometric Deep Learning to Structure-Based Protein Function Prediction [0.07340017786387766]
In this work, we isolate protein structure to make functional annotations for proteins in the Protein Data Bank. We present PersGNN - an end-to-end trainable deep learning model that combines graph representation learning with topological data analysis.
arXiv Detail & Related papers (2020-10-30T02:24:35Z)
Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate ($\geq$80%) predictions of protein class and architecture from structures determined at low ($\leq$3Å) resolution. We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
arXiv Detail & Related papers (2020-08-11T15:01:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.