Predicting mutational effects on protein-protein binding via a
side-chain diffusion probabilistic model
- URL: http://arxiv.org/abs/2310.19849v1
- Date: Mon, 30 Oct 2023 15:23:42 GMT
- Title: Predicting mutational effects on protein-protein binding via a
side-chain diffusion probabilistic model
- Authors: Shiwei Liu, Tian Zhu, Milong Ren, Chungong Yu, Dongbo Bu, Haicang
Zhang
- Abstract summary: We propose SidechainDiff, a representation learning-based approach that leverages unlabelled experimental protein structures.
SidechainDiff is the first diffusion-based generative model for side-chains, distinguishing it from prior efforts that have predominantly focused on generating protein backbone structures.
- Score: 14.949807579474781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many crucial biological processes rely on networks of protein-protein
interactions. Predicting the effect of amino acid mutations on protein-protein
binding is vital in protein engineering and therapeutic discovery. However, the
scarcity of annotated experimental data on binding energy poses a significant
challenge for developing computational approaches, particularly deep
learning-based methods. In this work, we propose SidechainDiff, a
representation learning-based approach that leverages unlabelled experimental
protein structures. SidechainDiff utilizes a Riemannian diffusion model to
learn the generative process of side-chain conformations and can also give the
structural context representations of mutations on the protein-protein
interface. Leveraging the learned representations, we achieve state-of-the-art
performance in predicting the mutational effects on protein-protein binding.
Furthermore, SidechainDiff is the first diffusion-based generative model for
side-chains, distinguishing it from prior efforts that have predominantly
focused on generating protein backbone structures.
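The abstract mentions a Riemannian diffusion model over side-chain conformations but gives no formulas. As a rough illustration only, the sketch below shows one common way to noise angular data, assuming a side-chain conformation is described by up to four torsion angles living on a torus. The function names (`wrap_angle`, `forward_noise_torus`), the `sigma` parameter, and the constant noise scale are hypothetical choices for this example and are not taken from the paper.

```python
import math
import random


def wrap_angle(theta):
    """Map an angle in radians into [-pi, pi), up to floating-point rounding."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def forward_noise_torus(chi, t, sigma=1.0, rng=None):
    """Perturb torsion angles with Brownian motion on a flat torus for time t.

    On a flat torus, Brownian motion amounts to adding Euclidean Gaussian
    noise with standard deviation sigma * sqrt(t) and wrapping the result.
    This is an illustrative assumption, not the authors' noise schedule.
    """
    if rng is None:
        rng = random.Random()
    std = sigma * math.sqrt(t)
    return [wrap_angle(angle + rng.gauss(0.0, std)) for angle in chi]


# Hypothetical torsion angles (radians) for one residue.
chi0 = [-1.1, 3.0, 0.5, -2.9]
chi_t = forward_noise_torus(chi0, t=0.5, rng=random.Random(0))
```

A generative model of this kind would then be trained to reverse the noising step. The wrapping is what distinguishes the angular setting from ordinary Euclidean diffusion, since angles near pi and -pi are neighbours.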
Related papers
- SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z)
- Loop-Diffusion: an equivariant diffusion model for designing and scoring protein loops [0.0]
Loop-Diffusion is an energy-based diffusion model that learns an energy function that generalizes to functional prediction tasks.
We evaluate Loop-Diffusion's performance on scoring TCR-pMHC interfaces and demonstrate state-of-the-art results in recognizing binding-enhancing mutations.
arXiv Detail & Related papers (2024-09-26T18:34:06Z)
- Boosting Protein Language Models with Negative Sample Mining [20.721167029530168]
We introduce a pioneering methodology for boosting large language models in the domain of protein representation learning.
Our primary contribution lies in the refinement process for correlating the over-reliance on co-evolution knowledge.
By capitalizing on this novel approach, our technique steers the training of transformer-based models within the attention score space.
arXiv Detail & Related papers (2024-05-28T07:24:20Z)
- ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z)
- Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z)
- Efficiently Predicting Mutational Effect on Homologous Proteins by Evolution Encoding [7.067145619709089]
EvolMPNN is an efficient model to learn evolution-aware protein embeddings.
Our model performs up to 6.4% better than state-of-the-art methods and attains a 36X inference speedup.
arXiv Detail & Related papers (2024-02-20T23:06:21Z)
- Multi-level Protein Representation Learning for Blind Mutational Effect Prediction [5.207307163958806]
This paper introduces a novel pre-training framework that cascades sequential and geometric analyzers for protein structures.
It guides mutational directions toward desired traits by simulating natural selection on wild-type proteins.
We assess the proposed approach using a public database and two new databases for a variety of variant effect prediction tasks.
arXiv Detail & Related papers (2023-06-08T03:00:50Z)
- A Latent Diffusion Model for Protein Structure Generation [50.74232632854264]
We propose a latent diffusion model that can reduce the complexity of protein modeling.
We show that our method can effectively generate novel protein backbone structures with high designability and efficiency.
arXiv Detail & Related papers (2023-05-06T19:10:19Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- Structure-aware Protein Self-supervised Learning [50.04673179816619]
We propose a novel structure-aware protein self-supervised learning method to capture structural information of proteins.
In particular, a well-designed graph neural network (GNN) model is pretrained to preserve the protein structural information.
We identify the relation between the sequential information in the protein language model and the structural information in the specially designed GNN model via a novel pseudo bi-level optimization scheme.
arXiv Detail & Related papers (2022-04-06T02:18:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.