PANDA: Predicting the change in proteins binding affinity upon mutations
using sequence information
- URL: http://arxiv.org/abs/2009.08869v1
- Date: Wed, 16 Sep 2020 17:12:25 GMT
- Title: PANDA: Predicting the change in proteins binding affinity upon mutations
using sequence information
- Authors: Wajid Arshad Abbasi, Syed Ali Abbas, Saiqa Andleeb
- Abstract summary: Determination of change in binding affinity upon mutations requires sophisticated, expensive, and time-consuming wet-lab experiments.
Most of the computational prediction techniques require protein structures that limit their applicability to protein complexes with known structures.
We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the change in protein binding affinity upon mutation.
- Score: 0.3867363075280544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately determining a change in protein binding affinity upon mutations is
important for the discovery and design of novel therapeutics and to assist
mutagenesis studies. Determination of change in binding affinity upon mutations
requires sophisticated, expensive, and time-consuming wet-lab experiments that
can be aided with computational methods. Most of the computational prediction
techniques require protein structures that limit their applicability to protein
complexes with known structures. In this work, we explore the sequence-based
prediction of change in protein binding affinity upon mutation. We have used
protein sequence information instead of protein structures along with machine
learning techniques to accurately predict the change in protein binding
affinity upon mutation. Our proposed sequence-based predictor of change in
protein binding affinity, called PANDA, gives better accuracy than existing
methods over the same validation set as well as on an external independent test
dataset. On an external test dataset, our proposed method gives a maximum
Pearson correlation coefficient of 0.52, compared with a maximum Pearson
correlation coefficient of 0.59 for MutaBind, the state-of-the-art protein
structure-based method. Our proposed protein sequence-based method for
predicting a change in binding affinity upon mutations has wide applicability
and performance comparable to existing protein structure-based methods. A
cloud-based webserver implementation of PANDA and its Python code are
available at
https://sites.google.com/view/wajidarshad/software and
https://github.com/wajidarshad/panda.
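The abstract reports performance as a Pearson correlation coefficient between predicted and experimentally measured changes in binding affinity. The following is a minimal sketch of how that metric can be computed. It is not taken from the PANDA code base; the function name `pearson_r` and the numbers in the example are illustrative assumptions only.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sum of products of deviations (covariance numerator) and
    # sums of squared deviations (variance numerators).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_m)


# Made-up values for illustration only (not data from the paper):
# predicted vs. measured changes in binding affinity for five mutations.
example_predicted = [0.4, 1.1, -0.2, 2.0, 0.7]
example_measured = [0.6, 0.9, 0.1, 1.5, 1.2]
example_r = pearson_r(example_predicted, example_measured)
```

A value near 1 means the predictions rank and scale the mutations much like the experiments do, while a value near 0 means no linear relationship.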
Related papers
- CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction [23.1499716310298]
We build the largest protein-RNA binding affinity dataset PRA310 for performance evaluation.
We provide extensive analyses and verify that CoPRA can (1) accurately predict the protein-RNA binding affinity; (2) understand the binding affinity change caused by mutations; and (3) benefit from scaling data and model size.
arXiv Detail & Related papers (2024-08-21T09:48:22Z)
- Learning to Predict Mutation Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning [78.38442423223832]
We develop a novel codebook pre-training task, namely masked microenvironment modeling.
We demonstrate superior performance and training efficiency over state-of-the-art pre-training-based methods in mutation effect prediction.
arXiv Detail & Related papers (2024-05-16T03:53:21Z)
- Protein binding affinity prediction under multiple substitutions applying eGNNs on Residue and Atomic graphs combined with Language model information: eGRAL [1.840390797252648]
Deep learning is increasingly recognized as a powerful tool capable of bridging the gap between in-silico predictions and in-vitro observations.
We propose eGRAL, a novel graph neural network architecture designed for predicting binding affinity changes from amino acid substitutions in protein complexes.
eGRAL leverages residue, atomic and evolutionary scales, thanks to features extracted from protein large language models.
arXiv Detail & Related papers (2024-05-03T10:33:19Z)
- ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z)
- NaNa and MiGu: Semantic Data Augmentation Techniques to Enhance Protein Classification in Graph Neural Networks [60.48306899271866]
We propose novel semantic data augmentation methods to incorporate backbone chemical and side-chain biophysical information into protein classification tasks.
Specifically, we leverage molecular biophysical, secondary structure, chemical bond, and ionic features of proteins to facilitate classification tasks.
arXiv Detail & Related papers (2024-03-21T13:27:57Z)
- Efficiently Predicting Mutational Effect on Homologous Proteins by Evolution Encoding [7.067145619709089]
EvolMPNN is an efficient model to learn evolution-aware protein embeddings.
Our model performs up to 6.4% better than state-of-the-art methods and attains a 36X inference speedup.
arXiv Detail & Related papers (2024-02-20T23:06:21Z)
- PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction [63.50967073653953]
Compound-Protein Interaction prediction aims to predict the pattern and strength of compound-protein interactions for rational drug discovery.
Existing deep learning-based methods utilize only the single modality of protein sequences or structures.
We propose a novel multi-scale Protein Sequence-structure Contrasting framework for CPI prediction.
arXiv Detail & Related papers (2024-02-13T03:51:10Z)
- Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict the thermostability changes in protein upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z)
- Pairing interacting protein sequences using masked language modeling [0.3222802562733787]
We develop a method to pair interacting protein sequences using protein language models trained on sequence alignments.
We exploit the ability of MSA Transformer to fill in masked amino acids in multiple sequence alignments using the surrounding context.
We show that it captures inter-chain coevolution while it was trained on single-chain data, which means that it can be used out-of-distribution.
arXiv Detail & Related papers (2023-08-14T13:42:09Z)
- Multi-level Protein Representation Learning for Blind Mutational Effect Prediction [5.207307163958806]
This paper introduces a novel pre-training framework that cascades sequential and geometric analyzers for protein structures.
It guides mutational directions toward desired traits by simulating natural selection on wild-type proteins.
We assess the proposed approach using a public database and two new databases for a variety of variant effect prediction tasks.
arXiv Detail & Related papers (2023-06-08T03:00:50Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)