Improving Protein-peptide Interface Predictions in the Low Data Regime
- URL: http://arxiv.org/abs/2306.00557v1
- Date: Wed, 31 May 2023 17:04:27 GMT
- Title: Improving Protein-peptide Interface Predictions in the Low Data Regime
- Authors: Justin Diamond, Markus Lill
- Abstract summary: We propose a novel approach for predicting protein-peptide interactions using a bi-modal transformer architecture.
We show that the distributions of inter-facial residue-residue interactions share overlap with inter residue-residue interactions.
This data augmentation allows us to leverage the vast amount of protein-only data available in the PDB to train neural networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel approach for predicting protein-peptide interactions using
a bi-modal transformer architecture that learns an inter-facial joint
distribution of residual contacts. The current data sets for crystallized
protein-peptide complexes are limited, making it difficult to accurately
predict interactions between proteins and peptides. To address this issue, we
propose augmenting the existing data from PepBDB with pseudo protein-peptide
complexes derived from the PDB. The augmented data set acts as a method to
transfer physics-based context-dependent intra-residue (within a domain)
interactions to the inter-residual (between) domains. We show that the
distributions of inter-facial residue-residue interactions share overlap with
inter residue-residue interactions, enough to increase predictive power of our
bi-modal transformer architecture. In addition, this data augmentation allows us
to leverage the vast amount of protein-only data available in the PDB to train
neural networks, in contrast to template-based modeling that acts as a prior
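The abstract does not spell out how pseudo protein-peptide complexes are derived from PDB entries. As one possible illustration only, the sketch below cuts a contiguous fragment out of a single chain, treats it as a pseudo-peptide, and computes its contact map against the rest of the chain. The function name `pseudo_complex_contacts`, the C-alpha representation, the 8 Å cutoff, and the `flank` exclusion are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def pseudo_complex_contacts(ca_coords, start, length, cutoff=8.0, flank=3):
    """Split one chain into a pseudo-peptide and a pseudo-receptor.

    Hypothetical helper: cutoff (in angstroms) and flank are illustrative choices.
    Returns receptor indices, peptide indices, and a boolean contact map
    of shape (n_receptor, n_peptide).
    """
    ca_coords = np.asarray(ca_coords, dtype=float)
    n = ca_coords.shape[0]
    if start < 0 or length <= 0 or start + length > n:
        raise ValueError("fragment must lie inside the chain")

    # Residues excised from the chain play the role of the peptide.
    pep_idx = np.arange(start, start + length)

    # Everything else is the receptor, minus residues flanking the cut,
    # so that trivial backbone neighbours are not counted as contacts.
    all_idx = np.arange(n)
    rec_mask = (all_idx < start - flank) | (all_idx >= start + length + flank)
    rec_idx = all_idx[rec_mask]

    # Pairwise C-alpha distances between receptor and peptide residues.
    diff = ca_coords[rec_idx][:, None, :] - ca_coords[pep_idx][None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return rec_idx, pep_idx, dist < cutoff

# Toy hairpin: two antiparallel strands, 3.8 angstroms between consecutive
# residues along x, with the strands 5 angstroms apart along y.
strand_a = np.array([[3.8 * i, 0.0, 0.0] for i in range(10)])
strand_b = np.array([[3.8 * (19 - i), 5.0, 0.0] for i in range(10, 20)])
coords = np.vstack([strand_a, strand_b])

rec_idx, pep_idx, contacts = pseudo_complex_contacts(coords, start=12, length=6)
print(contacts.shape, int(contacts.sum()))
```

A contact map built this way has the same receptor-by-peptide layout as one computed from a true complex, which is what would let both kinds of example be mixed in one training set.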
Related papers
- SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z)
- ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z)
- PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction [63.50967073653953]
Compound-Protein Interaction prediction aims to predict the pattern and strength of compound-protein interactions for rational drug discovery.
Existing deep learning-based methods utilize only the single modality of protein sequences or structures.
We propose a novel multi-scale Protein Sequence-structure Contrasting framework for CPI prediction.
arXiv Detail & Related papers (2024-02-13T03:51:10Z)
- Effective Protein-Protein Interaction Exploration with PPIretrieval [46.07027715907749]
We propose PPIretrieval, the first deep learning-based model for protein-protein interaction exploration.
PPIretrieval searches for potential PPIs in an embedding space, capturing rich geometric and chemical information of protein surfaces.
arXiv Detail & Related papers (2024-02-06T03:57:06Z)
- Protein-ligand binding representation learning from fine-grained interactions [29.965890962846093]
We propose to learn protein-ligand binding representation in a self-supervised learning manner.
This self-supervised learning problem is formulated as a prediction of the conclusive binding complex structure.
Experiments have demonstrated the superiority of our method across various binding tasks.
arXiv Detail & Related papers (2023-11-09T01:33:09Z)
- Improved K-mer Based Prediction of Protein-Protein Interactions With Chaos Game Representation, Deep Learning and Reduced Representation Bias [0.0]
We present a method for extracting unique pairs from an interaction dataset, generating non-redundant paired data for unbiased machine learning.
We develop a convolutional neural network model capable of learning and predicting interactions from Chaos Game Representations of proteins' coding genes.
arXiv Detail & Related papers (2023-10-23T10:02:23Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- Geometric Transformers for Protein Interface Contact Prediction [3.031630445636656]
We present the Geometric Transformer, a novel geometry-evolving graph transformer for rotation and translation-invariant protein interface contact prediction.
DeepInteract predicts partner-specific protein interface contacts given the 3D tertiary structures of two proteins as input.
arXiv Detail & Related papers (2021-10-06T00:12:15Z)
- Structure-aware Interactive Graph Neural Networks for the Prediction of Protein-Ligand Binding Affinity [52.67037774136973]
Drug discovery often relies on the successful prediction of protein-ligand binding affinity.
Recent advances have shown great promise in applying graph neural networks (GNNs) for better affinity prediction by learning the representations of protein-ligand complexes.
We propose a structure-aware interactive graph neural network (SIGN) which consists of two components: polar-inspired graph attention layers (PGAL) and pairwise interactive pooling (PiPool).
arXiv Detail & Related papers (2021-07-21T03:34:09Z)
- Cross-Modality Protein Embedding for Compound-Protein Affinity and Contact Prediction [15.955668586941472]
We consider proteins as multi-modal data including 1D amino-acid sequences and (sequence-predicted) 2D residue-pair contact maps.
We empirically evaluate the embeddings of the two single modalities in their accuracy and generalizability of CPAC prediction.
arXiv Detail & Related papers (2020-11-14T04:42:25Z)
- Deep Learning of High-Order Interactions for Protein Interface Prediction [58.164371994210406]
We propose to formulate the protein interface prediction as a 2D dense prediction problem.
We represent proteins as graphs and employ graph neural networks to learn node features.
We incorporate high-order pairwise interactions to generate a 3D tensor containing different pairwise interactions.
arXiv Detail & Related papers (2020-07-18T05:39:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.