Cross-Modality Protein Embedding for Compound-Protein Affinity and
Contact Prediction
- URL: http://arxiv.org/abs/2012.00651v1
- Date: Sat, 14 Nov 2020 04:42:25 GMT
- Title: Cross-Modality Protein Embedding for Compound-Protein Affinity and
Contact Prediction
- Authors: Yuning You, Yang Shen
- Abstract summary: We consider proteins as multi-modal data including 1D amino-acid sequences and (sequence-predicted) 2D residue-pair contact maps.
We empirically evaluate the embeddings of the two single modalities in their accuracy and generalizability of CPAC prediction.
- Score: 15.955668586941472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compound-protein pairs dominate FDA-approved drug-target pairs and the
prediction of compound-protein affinity and contact (CPAC) could help
accelerate drug discovery. In this study we consider proteins as multi-modal
data including 1D amino-acid sequences and (sequence-predicted) 2D residue-pair
contact maps. We empirically evaluate the embeddings of the two single
modalities in their accuracy and generalizability of CPAC prediction (i.e.,
structure-free interpretable compound-protein affinity prediction). We also
rationalize their performance in terms of two challenges: embedding individual
modalities and learning generalizable embedding-label relationships. We further
propose two models involving cross-modality protein embedding and establish
that the one with cross interaction (thus capturing correlations among
modalities) outperforms SOTAs and our single-modality models in affinity,
contact, and binding-site predictions for proteins never seen in the training
set.
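As an illustration of the idea in the abstract, the following is a minimal sketch of combining a 1D per-residue sequence embedding with a 2D contact-map-based embedding through a cross interaction. It is not the authors' model: the function names (`graph_embed`, `cross_interaction`), the neighbor-averaging step, and the attention-style interaction are all assumptions chosen only to show how correlations between the two modalities might be captured.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def graph_embed(contact_map, features):
    """Hypothetical 2D-modality embedding: one round of neighbor averaging
    over the (sequence-predicted) residue-pair contact map."""
    n = len(contact_map)
    # Add self-loops so each residue also keeps its own features.
    a = [[contact_map[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # Row-normalize so each residue averages over itself and its contacts.
    a = [[v / sum(row) for v in row] for row in a]
    return matmul(a, features)


def cross_interaction(h_seq, h_graph):
    """Hypothetical cross interaction: each residue's sequence embedding
    attends over all graph embeddings, and the attended vector is
    concatenated to the sequence embedding."""
    d = len(h_seq[0])
    h_graph_t = [list(col) for col in zip(*h_graph)]
    scores = matmul(h_seq, h_graph_t)  # L x L similarity scores
    scores = [[v / math.sqrt(d) for v in row] for row in scores]
    attn = softmax_rows(scores)
    attended = matmul(attn, h_graph)   # L x d
    # List "+" concatenates, giving a 2d-dimensional vector per residue.
    return [s + g for s, g in zip(h_seq, attended)]


random.seed(0)
L, d = 5, 4  # toy protein: 5 residues, 4-dimensional embeddings
h_seq = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
# Toy contact map: only sequence-adjacent residues are in contact.
contact = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(L)] for i in range(L)]
h_graph = graph_embed(contact, h_seq)
h_cross = cross_interaction(h_seq, h_graph)
print(len(h_cross), len(h_cross[0]))  # 5 8
```

In this sketch the cross-modality output keeps one vector per residue, so a downstream (hypothetical) head could still score per-residue contacts as well as a pooled affinity value.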
Related papers
- Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models [0.0]
We describe the accurate prediction of ligand-protein interaction (LPI) affinities with instruction fine-tuned pretrained generative small language models (SLMs).
Our results demonstrate a clear improvement over machine learning (ML) and free-energy perturbation (FEP+) based methods in accurately predicting a range of LPI affinities.
arXiv Detail & Related papers (2024-06-27T13:04:58Z)
- ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z)
- PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction [63.50967073653953]
Compound-Protein Interaction prediction aims to predict the pattern and strength of compound-protein interactions for rational drug discovery.
Existing deep learning-based methods utilize only the single modality of protein sequences or structures.
We propose a novel multi-scale Protein Sequence-structure Contrasting framework for CPI prediction.
arXiv Detail & Related papers (2024-02-13T03:51:10Z)
- Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict the thermostability changes in protein upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z)
- Protein-ligand binding representation learning from fine-grained interactions [29.965890962846093]
We propose to learn protein-ligand binding representation in a self-supervised learning manner.
This self-supervised learning problem is formulated as a prediction of the conclusive binding complex structure.
Experiments have demonstrated the superiority of our method across various binding tasks.
arXiv Detail & Related papers (2023-11-09T01:33:09Z)
- Improving Protein-peptide Interface Predictions in the Low Data Regime [0.0]
We propose a novel approach for predicting protein-peptide interactions using a bi-modal transformer architecture.
We show that the distributions of inter-facial residue-residue interactions share overlap with inter residue-residue interactions.
This data augmentation allows us to leverage the vast amount of protein-only data available in the PepBDB to train neural networks.
arXiv Detail & Related papers (2023-05-31T17:04:27Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts [80.69440684790925]
DeepRelations is a physics-inspired deep relational network with intrinsically explainable architecture.
It shows superior interpretability to the state-of-the-art.
It boosts the AUPRC of contact prediction by 9.5-, 16.9-, 19.3-, and 5.7-fold on the test, compound-unique, protein-unique, and both-unique sets, respectively.
arXiv Detail & Related papers (2019-12-29T00:14:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.