A Brief Review of Machine Learning Techniques for Protein
Phosphorylation Sites Prediction
- URL: http://arxiv.org/abs/2108.04951v1
- Date: Tue, 10 Aug 2021 22:23:30 GMT
- Title: A Brief Review of Machine Learning Techniques for Protein
Phosphorylation Sites Prediction
- Authors: Farzaneh Esmaili, Mahdi Pourmirzaei, Shahin Ramazi, Elham Yavari
- Abstract summary: Reversible Post-Translational Modifications (PTMs) have vital roles in extending the functional diversity of proteins.
PTMs have emerged as crucial molecular regulatory mechanisms that are utilized to regulate diverse cellular processes.
Disorders in phosphorylation are associated with multiple diseases, including neurological disorders and cancers.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reversible Post-Translational Modifications (PTMs) have vital roles in
extending the functional diversity of proteins and meaningfully affect the
regulation of protein functions in prokaryotic and eukaryotic organisms. PTMs
have emerged as crucial molecular regulatory mechanisms that are utilized to
regulate diverse cellular processes. Among the most well-studied PTMs,
phosphorylation occurs in many types of proteins and plays significant roles
in many biological processes. Disorders in this modification are associated
with multiple diseases, including neurological disorders and cancers.
Therefore, it is necessary to predict the phosphorylation of target residues
in an uncharacterized amino acid sequence. Most experimental techniques for
identifying phosphorylation are time-consuming, costly, and error-prone. As a
result, computational methods have replaced these techniques. These days, a
vast amount of phosphorylation data is publicly accessible through many online
databases. In this study, we first comprehensively review the PTM datasets
that include phosphorylation sites (p-sites). We then show that there are two
main approaches to phosphorylation prediction by machine learning, End-to-End
and conventional, and give an overview of both. Finally, we introduce 15
important feature extraction techniques that have mostly been used in
conventional machine learning methods.
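As a rough illustration of what the "conventional" route involves, the sketch below cuts a fixed-length window around each candidate residue (S, T, or Y) and turns it into a hand-crafted feature vector. The abstract does not enumerate its 15 feature extraction techniques, so amino acid composition is used here only as a familiar example; the function names, the flank size, and the `X` padding symbol are all assumptions made for this sketch, not the authors' method.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TARGETS = "STY"                       # residues that can be phosphorylated


def candidate_windows(sequence, flank=3, pad="X"):
    """Yield (1-based position, window) for every S/T/Y in the sequence.

    The sequence is padded with a placeholder symbol (an assumption of this
    sketch) so that sites near the termini still get full-length windows.
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue in TARGETS:
            yield i + 1, padded[i : i + 2 * flank + 1]


def amino_acid_composition(window):
    """Return the 20-dimensional frequency vector of standard residues.

    Padding symbols are ignored when computing the frequencies.
    """
    counts = Counter(window)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


# Toy sequence, invented for illustration only.
windows = list(candidate_windows("MKSPTAY", flank=2))
features = [amino_acid_composition(w) for _, w in windows]
```

Each feature vector would then be passed to a standard classifier. An End-to-End model would instead take the raw window (or the whole sequence) as input and learn its own representation.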
Related papers
- Computational Protein Science in the Era of Large Language Models (LLMs) [54.35488233989787]
Computational protein science is dedicated to revealing knowledge and developing applications within the protein sequence-structure-function paradigm.
Recently, Language Models (pLMs) have emerged as a milestone in AI due to their unprecedented language processing & generalization capability.
arXiv Detail & Related papers (2025-01-17T16:21:18Z) - Multi-modal Representation Learning Enables Accurate Protein Function Prediction in Low-Data Setting [0.0]
HOPER (HOlistic ProtEin Representation) is a novel framework designed to enhance protein function prediction (PFP) in low-data settings.
Our results highlight the effectiveness of multimodal representation learning for overcoming data limitations in biological research.
arXiv Detail & Related papers (2024-11-22T20:13:55Z) - MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction [65.33218256339151]
Post-translational modifications (PTMs) profoundly expand the complexity and functionality of the proteome.
Existing computational approaches predominantly focus on protein sequences to predict PTM sites, driven by the recognition of sequence-dependent motifs.
We introduce the MeToken model, which tokenizes the micro-environment of each amino acid, integrating both sequence and structural information into unified discrete tokens.
arXiv Detail & Related papers (2024-11-04T07:14:28Z) - NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics [58.03989832372747]
We present the first unified benchmark NovoBench for de novo peptide sequencing.
It comprises diverse mass spectrum data, integrated models, and comprehensive evaluation metrics.
Recent methods, including DeepNovo, PointNovo, Casanovo, InstaNovo, AdaNovo, and $\pi$-HelixNovo, are integrated into our framework.
arXiv Detail & Related papers (2024-06-16T08:23:21Z) - ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z) - MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction
Prediction via Microenvironment-Aware Protein Embedding [82.31506767274841]
Protein-Protein Interactions (PPIs) are fundamental in various biological processes and play a key role in life activities.
MAPE-PPI encodes microenvironments into chemically meaningful discrete codes via a sufficiently large microenvironment "vocabulary".
MAPE-PPI can scale to PPI prediction with millions of PPIs with superior trade-offs between effectiveness and computational efficiency.
arXiv Detail & Related papers (2024-02-22T09:04:41Z) - Efficiently Predicting Protein Stability Changes Upon Single-point
Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict the thermostability changes in protein upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z) - PTransIPs: Identification of phosphorylation sites enhanced by protein
PLM embeddings [2.971764950146918]
We develop PTransIPs, a new deep learning framework for the identification of phosphorylation sites.
PTransIPs outperforms existing state-of-the-art (SOTA) methods, achieving AUCs of 0.9232 and 0.9660.
arXiv Detail & Related papers (2023-08-08T07:50:38Z) - AmorProt: Amino Acid Molecular Fingerprints Repurposing based Protein
Fingerprint [0.0]
We propose the amino acid molecular fingerprints repurposing based protein (AmorProt) fingerprint.
The performances of the tree based machine learning and artificial neural network models were compared.
The results revealed that the current protein representation method can be applied to various fields related to proteins.
arXiv Detail & Related papers (2023-03-27T23:57:47Z) - Deep Learning Methods for Protein Family Classification on PDB
Sequencing Data [0.0]
We demonstrate and compare the performance of several deep learning frameworks, including novel bi-directional LSTM and convolutional models, on widely available sequencing data.
Our results show that our deep learning models deliver superior performance to classical machine learning methods, with the convolutional architecture providing the most impressive inference performance.
arXiv Detail & Related papers (2022-07-14T06:11:32Z) - Using Genetic Programming to Predict and Optimize Protein Function [65.25258357832584]
We propose POET, a computational Genetic Programming tool based on evolutionary methods to enhance screening and mutagenesis in Directed Evolution.
As a proof-of-concept we use peptides that generate MRI contrast detected by the Chemical Exchange Saturation Transfer mechanism.
Our results indicate that a computational modelling tool like POET can help to find peptides with 400% better functionality than used before.
arXiv Detail & Related papers (2022-02-08T18:08:08Z) - Multimodal Pre-Training Model for Sequence-based Prediction of
Protein-Protein Interaction [7.022012579173686]
Pre-training a protein model to learn effective representation is critical for protein-protein interactions.
Most pre-training models for PPIs are sequence-based, which naively adopt the language models used in natural language processing to amino acid sequences.
We propose a multimodal protein pre-training model with three modalities: sequence, structure, and function.
arXiv Detail & Related papers (2021-12-09T10:21:52Z)