End-to-End Optimized Pipeline for Prediction of Protein Folding Kinetics
- URL: http://arxiv.org/abs/2309.09191v1
- Date: Sun, 17 Sep 2023 07:35:54 GMT
- Title: End-to-End Optimized Pipeline for Prediction of Protein Folding Kinetics
- Authors: Vijay Arvind.R and Haribharathi Sivakumar and Brindha.R
- Abstract summary: This research proposes an efficient pipeline for predicting protein folding kinetics with high accuracy and low memory footprint.
The deployed machine learning (ML) model outperformed the state-of-the-art ML models by 4.8% in terms of accuracy while consuming 327x less memory and being 7.3% faster.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Protein folding is the intricate process by which a linear sequence of amino
acids self-assembles into a unique three-dimensional structure. Protein folding
kinetics is the study of pathways and time-dependent mechanisms a protein
undergoes when it folds. Understanding protein folding kinetics is essential because a
protein must fold correctly to perform its biological functions
optimally, and a misfolded protein can sometimes be contorted into shapes that
are not ideal for a cellular environment, giving rise to many degenerative and
neurodegenerative disorders and amyloid diseases. Monitoring at-risk
individuals and detecting discrepancies in a protein's folding kinetics
at an early stage could yield major public health benefits, as
preventive measures can be taken. This research proposes an efficient pipeline
for predicting protein folding kinetics with high accuracy and low memory
footprint. The deployed machine learning (ML) model outperformed the
state-of-the-art ML models by 4.8% in terms of accuracy while consuming 327x
less memory and being 7.3% faster.
Related papers
- Long-context Protein Language Model [76.95505296417866]
Self-supervised training of language models (LMs) has seen great success for protein sequences in learning meaningful representations and for generative drug design.
Most protein LMs are based on the Transformer architecture trained on individual proteins with short context lengths.
We propose LC-PLM based on an alternative protein LM architecture, BiMamba-S, built off selective structured state-space models.
We also introduce its graph-contextual variant, LC-PLM-G, which contextualizes protein-protein interaction graphs for a second stage of training.
arXiv Detail & Related papers (2024-10-29T16:43:28Z)
- Geometric Self-Supervised Pretraining on 3D Protein Structures using Subgraphs [26.727436310732692]
We propose a novel self-supervised method to pretrain 3D graph neural networks on 3D protein structures.
We experimentally show that our proposed pretraining strategy leads to significant improvements of up to 6%.
arXiv Detail & Related papers (2024-06-20T09:34:31Z)
- ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z)
- Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z)
- Efficiently Predicting Mutational Effect on Homologous Proteins by Evolution Encoding [7.067145619709089]
EvolMPNN is an efficient model to learn evolution-aware protein embeddings.
Our model performs up to 6.4% better than state-of-the-art methods and attains a 36X inference speedup.
arXiv Detail & Related papers (2024-02-20T23:06:21Z)
- An approach to solve the coarse-grained Protein folding problem in a Quantum Computer [0.0]
Understanding protein structures and enzymes plays a critical role in target-based drug design, elucidating protein-related disease mechanisms, and developing novel enzymes.
Recent advancements in AI-based protein structure prediction methods have solved the protein folding problem to an extent, but their precision in determining the structure of proteins with low sequence similarity is limited.
In this work we developed a novel turn-based encoding algorithm that can be run on a gate-based quantum computer for predicting the structure of smaller protein sequences.
arXiv Detail & Related papers (2023-11-23T18:20:05Z)
- Multi-level Protein Representation Learning for Blind Mutational Effect Prediction [5.207307163958806]
This paper introduces a novel pre-training framework that cascades sequential and geometric analyzers for protein structures.
It guides mutational directions toward desired traits by simulating natural selection on wild-type proteins.
We assess the proposed approach using a public database and two new databases for a variety of variant effect prediction tasks.
arXiv Detail & Related papers (2023-06-08T03:00:50Z)
- A Latent Diffusion Model for Protein Structure Generation [50.74232632854264]
We propose a latent diffusion model that can reduce the complexity of protein modeling.
We show that our method can effectively generate novel protein backbone structures with high designability and efficiency.
arXiv Detail & Related papers (2023-05-06T19:10:19Z)
- Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs).
We conduct a structural surgery on pLMs, where a lightweight structural adapter is implanted into pLMs and endows them with structural awareness.
Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-02-03T10:49:52Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- Leveraging Sequence Embedding and Convolutional Neural Network for Protein Function Prediction [27.212743275697825]
The main challenges of protein function prediction are the large label space and the lack of labeled training data.
Our method leverages unsupervised sequence embedding and the success of deep convolutional neural networks to overcome these challenges.
arXiv Detail & Related papers (2021-12-01T08:31:01Z)