A Protein Structure Prediction Approach Leveraging Transformer and CNN Integration
- URL: http://arxiv.org/abs/2402.19095v2
- Date: Fri, 8 Mar 2024 05:30:10 GMT
- Title: A Protein Structure Prediction Approach Leveraging Transformer and CNN Integration
- Authors: Yanlin Zhou, Kai Tan, Xinyu Shen, Zheng He, Haotian Zheng
- Abstract summary: This paper adopts a two-dimensional fusion deep neural network model, DstruCCN, which uses Convolutional Neural Networks (CNN) and a supervised Transformer protein language model for single-sequence protein structure prediction.
The training features of the two are combined to predict the protein Transformer binding site matrix, and then the three-dimensional structure is reconstructed using energy minimization.
- Score: 4.909112037834705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Proteins are essential for life, and their structure determines their
function. The protein secondary structure is formed by the folding of the
protein primary structure, and the protein tertiary structure is formed by the
bending and folding of the secondary structure. Therefore, the study of protein
secondary structure is very helpful to the overall understanding of protein
structure. Although the accuracy of protein secondary structure prediction has
continuously improved with the development of machine learning and deep
learning, progress in the field of protein structure prediction, unfortunately,
remains insufficient to meet the large demand for protein information.
Therefore, based on the advantages of deep learning-based methods in feature
extraction and learning ability, this paper adopts a two-dimensional fusion
deep neural network model, DstruCCN, which uses Convolutional Neural Networks
(CNN) and a supervised Transformer protein language model for single-sequence
protein structure prediction. The training features of the two are combined to
predict the protein Transformer binding site matrix, and then the
three-dimensional structure is reconstructed using energy minimization.
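The final step of the abstract, rebuilding 3D coordinates from a predicted residue-by-residue matrix via energy minimization, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the matrix format or the energy function. The sketch assumes the predicted matrix can be read as target inter-residue distances, and it uses a simple squared-error "stress" energy minimized by plain gradient descent. The names `pairwise_dist`, `stress`, and `reconstruct_coords` are hypothetical, and the helix in the demo is a toy input rather than data from the paper.

```python
import numpy as np


def pairwise_dist(coords):
    """Return the (n, n) matrix of Euclidean distances between rows of coords."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def stress(coords, target):
    """Energy E = sum over i < j of (||x_i - x_j|| - target_ij)^2."""
    err = pairwise_dist(coords) - target
    return float(np.triu(err ** 2, k=1).sum())


def reconstruct_coords(target, n_steps=2000, lr=0.1, seed=0):
    """Minimize the stress energy by gradient descent from a random start.

    Assumption: `target` is a symmetric (n, n) matrix of desired distances.
    The step size is divided by n so that it stays stable as n grows.
    """
    rng = np.random.default_rng(seed)
    n = target.shape[0]
    coords = rng.normal(size=(n, 3))
    for _ in range(n_steps):
        diff = coords[:, None, :] - coords[None, :, :]        # (n, n, 3)
        dist = np.maximum(np.linalg.norm(diff, axis=-1), 1e-9)  # avoid 0-division
        err = dist - target
        # dE/dx_i = 2 * sum_j (dist_ij - target_ij) * (x_i - x_j) / dist_ij
        grad = 2.0 * ((err / dist)[:, :, None] * diff).sum(axis=1)
        coords -= (lr / n) * grad
    return coords


if __name__ == "__main__":
    # Toy input: ten points on an idealized helix (not data from the paper).
    t = np.arange(10)
    true_coords = np.stack(
        [2.3 * np.cos(1.75 * t), 2.3 * np.sin(1.75 * t), 1.5 * t], axis=1
    )
    target = pairwise_dist(true_coords)
    recon = reconstruct_coords(target)
    print("final stress:", stress(recon, target))
```

Because the energy depends only on distances, any solution is determined only up to rotation, translation, and mirror reflection; a practical pipeline would need extra terms (for example chirality or bond-geometry restraints) to choose among them.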
Related papers
- CPE-Pro: A Structure-Sensitive Deep Learning Method for Protein Representation and Origin Evaluation [7.161099050722313]
We develop a structure-sensitive supervised deep learning model, Crystal vs Predicted Evaluator for Protein Structure (CPE-Pro).
CPE-Pro learns the structural information of proteins and captures inter-structural differences to achieve accurate traceability on four data classes.
We utilize Foldseek to encode protein structures into "structure-sequences" and train a protein Structural Sequence Language Model, SSLM.
(2024-10-21T02:21:56Z)
- 4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment [18.90451943620277]
This study introduces an innovative 4D diffusion model incorporating molecular dynamics (MD) simulation data to learn dynamic protein structures.
To our knowledge, this is the first diffusion-based model aimed at predicting protein trajectories across multiple time steps simultaneously.
(2024-08-22T14:12:50Z)
- Protein 3D Graph Structure Learning for Robust Structure-based Protein Property Prediction [43.46012602267272]
Protein structure-based property prediction has emerged as a promising approach for various biological tasks.
Current practices, which simply employ accurately predicted structures during inference, suffer from notable degradation in prediction accuracy.
Our framework is model-agnostic and effective in improving the property prediction of both predicted structures and experimental structures.
(2023-10-14T08:43:42Z)
- FFF: Fragments-Guided Flexible Fitting for Building Complete Protein Structures [10.682516227941592]
We propose a new method named FFF that bridges protein structure prediction and protein structure recognition with flexible fitting.
First, a multi-level recognition network is used to capture various structural features from the input 3D cryo-EM map.
Next, protein structural fragments are generated using pseudo peptide vectors and a protein sequence alignment method based on these extracted features.
(2023-08-07T15:10:21Z)
- Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs).
We conduct a structural surgery on pLMs, where a lightweight structural adapter is implanted into pLMs and endows them with structural awareness.
Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
(2023-02-03T10:49:52Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
(2022-09-30T01:46:38Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
(2022-05-20T19:38:00Z)
- Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking [57.2037357017652]
We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures.
We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right docked position.
Our model, named EquiDock, approximates the binding pockets and predicts the docking poses using keypoint matching and alignment.
(2021-11-15T18:46:37Z)
- EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
(2021-05-11T03:40:29Z)
- Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate ($\geq$80%) predictions of protein class and architecture from structures determined at low ($\leq$3Å) resolution.
We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
(2020-08-11T15:01:32Z)