HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction
- URL: http://arxiv.org/abs/2407.06703v1
- Date: Tue, 9 Jul 2024 09:31:05 GMT
- Title: HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction
- Authors: Gian Marco Visani, Michael N. Pun, William Galvin, Eric Daniel, Kevin Borisiak, Utheri Wagura, Armita Nourmohammad,
- Abstract summary: HERMES is a 3D rotationally equivariant structure-based neural network model for mutational effect and stability prediction.
We present a suite of HERMES models, pre-trained with different strategies, and fine-tuned to predict the stability effect of mutations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting the stability and fitness effects of amino acid mutations in proteins is a cornerstone of biological discovery and engineering. Various experimental techniques have been developed to measure mutational effects, providing us with extensive datasets across a diverse range of proteins. By training on these data, traditional computational modeling and more recent machine learning approaches have advanced significantly in predicting mutational effects. Here, we introduce HERMES, a 3D rotationally equivariant structure-based neural network model for mutational effect and stability prediction. Pre-trained to predict amino acid propensity from its surrounding 3D structure, HERMES can be fine-tuned for mutational effects using our open-source code. We present a suite of HERMES models, pre-trained with different strategies, and fine-tuned to predict the stability effect of mutations. Benchmarking against other models shows that HERMES often outperforms or matches their performance in predicting mutational effect on stability, binding, and fitness. HERMES offers versatile tools for evaluating mutational effects and can be fine-tuned for specific predictive objectives.
Related papers
- SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z) - Retrieval-Enhanced Mutation Mastery: Augmenting Zero-Shot Prediction of Protein Language Model [3.4494754789770186]
Deep learning methods for protein modeling have demonstrated superior results at lower costs compared to traditional approaches.
In mutation effect prediction, the key to pre-training deep learning models lies in accurately interpreting the complex relationships among protein sequence, structure, and function.
This study introduces a retrieval-enhanced protein language model for comprehensive analysis of native properties from sequence and local structural interactions.
arXiv Detail & Related papers (2024-10-28T15:28:51Z) - Learning to Predict Mutation Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning [78.38442423223832]
We develop a novel codebook pre-training task, namely masked microenvironment modeling.
We demonstrate superior performance and training efficiency over state-of-the-art pre-training-based methods in mutation effect prediction.
arXiv Detail & Related papers (2024-05-16T03:53:21Z) - Protein binding affinity prediction under multiple substitutions applying eGNNs on Residue and Atomic graphs combined with Language model information: eGRAL [1.840390797252648]
Deep learning is increasingly recognized as a powerful tool capable of bridging the gap between in-silico predictions and in-vitro observations.
We propose eGRAL, a novel graph neural network architecture designed for predicting binding affinity changes from amino acid substitutions in protein complexes.
eGRAL leverages residue, atomic and evolutionary scales, thanks to features extracted from protein large language models.
arXiv Detail & Related papers (2024-05-03T10:33:19Z) - Efficiently Predicting Protein Stability Changes Upon Single-point
Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict the thermostability changes in protein upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare.
Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - Fast and Functional Structured Data Generators Rooted in
Out-of-Equilibrium Physics [62.997667081978825]
We address the challenge of using energy-based models to produce high-quality, label-specific data in structured datasets.
Traditional training methods encounter difficulties due to inefficient Markov chain Monte Carlo mixing.
We use a novel training algorithm that exploits non-equilibrium effects.
arXiv Detail & Related papers (2023-07-13T15:08:44Z) - Accurate and Definite Mutational Effect Prediction with Lightweight
Equivariant Graph Neural Networks [2.381587712372268]
This research introduces a lightweight graph representation learning scheme that efficiently analyzes the microenvironment of wild-type proteins.
Our solution offers a wide range of benefits that make it an ideal choice for the community.
arXiv Detail & Related papers (2023-04-13T09:51:49Z) - Protein language model rescue mutations highlight variant effects and
structure in clinically relevant genes [1.7970523486905976]
We interrogate the use of protein language models in characterizing known pathogenic mutations in curated, medically actionable genes.
Systematic analysis of the predicted effects of these compensatory mutations reveal unappreciated structural features of proteins.
We encourage the community to generate and curate rescue mutation experiments to inform the design of more sophisticated co-masking strategies.
arXiv Detail & Related papers (2022-11-18T03:00:52Z) - Using Genetic Programming to Predict and Optimize Protein Function [65.25258357832584]
We propose POET, a computational Genetic Programming tool based on evolutionary methods to enhance screening and mutagenesis in Directed Evolution.
As a proof-of-concept we use peptides that generate MRI contrast detected by the Chemical Exchange Saturation Transfer mechanism.
Our results indicate that a computational modelling tool like POET can help to find peptides with 400% better functionality than used before.
arXiv Detail & Related papers (2022-02-08T18:08:08Z) - SPLDExtraTrees: Robust machine learning approach for predicting kinase
inhibitor resistance [1.0674604700001966]
We propose a robust machine learning method, SPLDExtraTrees, which can accurately predict ligand binding affinity changes upon protein mutation.
The proposed method ranks training data following a specific scheme that starts with easy-to-learn samples.
Experiments substantiate the capability of the proposed method for predicting kinase inhibitor resistance under three scenarios.
arXiv Detail & Related papers (2021-11-15T09:07:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.