Related papers: Predicting protein stability changes under multiple amino acid substitutions using equivariant graph neural networks

Predicting protein stability changes under multiple amino acid substitutions using equivariant graph neural networks

URL: http://arxiv.org/abs/2305.19801v1
Date: Tue, 30 May 2023 14:48:06 GMT
Title: Predicting protein stability changes under multiple amino acid substitutions using equivariant graph neural networks
Authors: Sebastien Boyer, Sam Money-Kyrle, Oliver Bent
Abstract summary: We propose improvements to state-of-the-art Deep learning (DL) protein stability prediction models. This was achieved using E(3)-equivariant graph neural networks (EGNNs) for both atomic environment (AE) embedding and residue-level scoring tasks. We demonstrate the immediately promising results of this procedure, discuss the current shortcomings, and highlight potential future strategies.
Score: 2.5137859989323537
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The accurate prediction of changes in protein stability under multiple amino acid substitutions is essential for realising true in-silico protein re-design. To this purpose, we propose improvements to state-of-the-art Deep learning (DL) protein stability prediction models, enabling first-of-a-kind predictions for variable numbers of amino acid substitutions, on structural representations, by decoupling the atomic and residue scales of protein representations. This was achieved using E(3)-equivariant graph neural networks (EGNNs) for both atomic environment (AE) embedding and residue-level scoring tasks. Our AE embedder was used to featurise a residue-level graph, then trained to score mutant stability ($\Delta\Delta G$). To achieve effective training of this predictive EGNN we have leveraged the unprecedented scale of a new high-throughput protein stability experimental data-set, Mega-scale. Finally, we demonstrate the immediately promising results of this procedure, discuss the current shortcomings, and highlight potential future strategies.

Related papers

Mass Balance Approximation of Unfolding Improves Potential-Like Methods for Protein Stability Predictions [0.0]
Deep-learning strategies have pushed the field forward, but their use in standard methods remains limited due to resource demands. This study shows that incorporating a mass-balance correction (MBC) to account for the unfolded state significantly enhances these methods. While many machine learning models partially model this balance, our analysis suggests that a refined representation of the unfolded state may improve the predictive performance.
arXiv Detail & Related papers (2025-04-09T11:53:02Z)
AlgoRxplorers | Precision in Mutation: Enhancing Drug Design with Advanced Protein Stability Prediction Tools [0.6749750044497732]
Predicting the impact of single-point amino acid mutations on protein stability is essential for understanding disease mechanisms and advancing drug development. Protein stability, quantified by changes in Gibbs free energy ($DeltaDelta G$), is influenced by these mutations. This study proposes the application of deep neural networks, leveraging transfer learning and fusing complementary information from different models, to create a feature-rich representation of the protein stability landscape.
arXiv Detail & Related papers (2025-01-13T02:17:01Z)
Leveraging Multi-modal Representations to Predict Protein Melting Temperatures [4.105077436212467]
We develop models based on powerful protein language models, including ESM-2, ESM-3 and AlphaFold. We obtain a new state-of-the-art performance on the s571 test dataset, obtaining a Pearson correlation coefficient (PCC) of 0.50.
arXiv Detail & Related papers (2024-12-05T16:03:09Z)
SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models. It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features. Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z)
HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction [0.0]
HERMES is a 3D rotationally equivariant structure-based neural network model for mutational effect and stability prediction. We present a suite of HERMES models, pre-trained with different strategies, and fine-tuned to predict the stability effect of mutations.
arXiv Detail & Related papers (2024-07-09T09:31:05Z)
Protein binding affinity prediction under multiple substitutions applying eGNNs on Residue and Atomic graphs combined with Language model information: eGRAL [1.840390797252648]
Deep learning is increasingly recognized as a powerful tool capable of bridging the gap between in-silico predictions and in-vitro observations. We propose eGRAL, a novel graph neural network architecture designed for predicting binding affinity changes from amino acid substitutions in protein complexes. eGRAL leverages residue, atomic and evolutionary scales, thanks to features extracted from protein large language models.
arXiv Detail & Related papers (2024-05-03T10:33:19Z)
Beyond ESM2: Graph-Enhanced Protein Sequence Modeling with Efficient Clustering [24.415612744612773]
Proteins are essential to life's processes, underpinning evolution and diversity. Advances in sequencing technology have revealed millions of proteins, underscoring the need for sophisticated pre-trained protein models for biological analysis and AI development. Facebook's ESM2, the most advanced protein language model to date, leverages a masked prediction task for unsupervised learning, crafting amino acid representations with notable biochemical accuracy. Yet, it lacks in delivering functional protein insights, signaling an opportunity for enhancing representation quality. This study addresses this gap by incorporating protein family classification into ESM2's training, while a contextual prediction task fine-tunes local
arXiv Detail & Related papers (2024-04-24T11:09:43Z)
Efficiently Predicting Mutational Effect on Homologous Proteins by Evolution Encoding [7.067145619709089]
EvolMPNN is an efficient model to learn evolution-aware protein embeddings. Our model shows up to 6.4% better than state-of-the-art methods and attains 36X inference speedup.
arXiv Detail & Related papers (2024-02-20T23:06:21Z)
Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry. We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict the thermostability changes in protein upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z)
Reprogramming Pretrained Language Models for Protein Sequence Representation Learning [68.75392232599654]
We propose Representation Learning via Dictionary Learning (R2DL), an end-to-end representation learning framework. R2DL reprograms a pretrained English language model to learn the embeddings of protein sequences. Our model can attain better accuracy and significantly improve the data efficiency by up to $105$ times over the baselines set by pretrained and standard supervised methods.
arXiv Detail & Related papers (2023-01-05T15:55:18Z)
State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures. Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein. Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules. Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network. Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.