Sequence-guided protein structure determination using graph
convolutional and recurrent networks
- URL: http://arxiv.org/abs/2007.06847v3
- Date: Thu, 3 Sep 2020 02:25:28 GMT
- Authors: Po-Nan Li and Saulo H. P. de Oliveira and Soichi Wakatsuki and Henry
van den Bedem
- Abstract summary: Single particle, cryogenic electron microscopy (cryo-EM) experiments now routinely produce high-resolution data for large proteins.
Existing protocols for building atomic models into cryo-EM density maps often rely on significant human intervention and can take hours to many days to produce an output.
Here, we present a fully automated, template-free model building approach that is based entirely on neural networks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single particle, cryogenic electron microscopy (cryo-EM) experiments now
routinely produce high-resolution data for large proteins and their complexes.
Building an atomic model into a cryo-EM density map is challenging,
particularly when no structure for the target protein is known a priori.
Existing protocols for this type of task often rely on significant human
intervention and can take hours to many days to produce an output. Here, we
present a fully automated, template-free model building approach that is based
entirely on neural networks. We use a graph convolutional network (GCN) to
generate an embedding from a set of rotamer-based amino acid identities and
candidate 3-dimensional C$\alpha$ locations. Starting from this embedding, we
use a bidirectional long short-term memory (LSTM) module to order and label the
candidate identities and atomic locations consistent with the input protein
sequence to obtain a structural model. Our approach paves the way for
determining protein structures from cryo-EM densities in a fraction of the time
required by existing approaches and without the need for human intervention.
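As a rough illustration of the graph-convolution step mentioned in the abstract, the sketch below links candidate C-alpha positions that lie within a distance cutoff and averages per-node features over each neighbourhood. It is not the authors' architecture: the function names, the 4.5 angstrom cutoff, the mean-aggregation rule, and the toy two-dimensional features are all assumptions chosen for illustration, and the bidirectional LSTM stage is omitted.

```python
import math


def build_adjacency(coords, cutoff=4.5):
    """Link candidate C-alpha positions within `cutoff` angstroms (hypothetical cutoff).

    Self-loops are included so every node keeps its own features.
    """
    n = len(coords)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = 1.0
    return adj


def graph_conv(adj, feats, weight):
    """One mean-aggregation graph convolution: ReLU(D^-1 A H W)."""
    n = len(adj)
    in_dim = len(feats[0])
    out_dim = len(weight[0])
    out = []
    for i in range(n):
        degree = sum(adj[i])  # at least 1 because of the self-loop
        # Average the features of node i and its neighbours.
        agg = [
            sum(adj[i][j] * feats[j][k] for j in range(n)) / degree
            for k in range(in_dim)
        ]
        # Apply the linear map, then ReLU.
        out.append([
            max(0.0, sum(agg[k] * weight[k][m] for k in range(in_dim)))
            for m in range(out_dim)
        ])
    return out


# Toy input: two candidates 3.8 angstroms apart and one isolated candidate.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
# Toy per-node features standing in for amino acid identity scores.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

adj = build_adjacency(coords)
embedding = graph_conv(adj, feats, identity)
print(embedding)  # [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
```

In this toy run the two nearby candidates end up with identical blended features, while the isolated candidate keeps its own. A sequence model such as the bidirectional LSTM described above would then consume embeddings of this kind to order and label the candidates.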
Related papers
- SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z)
- Target-aware Variational Auto-encoders for Ligand Generation with Multimodal Protein Representation Learning [2.01243755755303]
We introduce TargetVAE, a target-aware auto-encoder that generates ligands with high binding affinities to arbitrary protein targets.
This is the first effort to unify different representations of proteins into a single model, which we name the Protein Multimodal Network (PMN).
arXiv Detail & Related papers (2023-08-02T12:08:17Z)
- ModelAngelo: Automated Model Building in Cryo-EM Maps [1.2891210250935146]
We build ModelAngelo for automated model building of proteins in cryo-EM maps.
Recent advances in machine learning applications to protein structure prediction show potential for automating this process.
ModelAngelo outperforms the state-of-the-art and approximates manual building for cryo-EM maps with resolutions better than 3.5 Å.
arXiv Detail & Related papers (2022-09-30T16:47:45Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative [61.984700682903096]
HelixFold-Single is proposed to combine a large-scale protein language model with the superior geometric learning capability of AlphaFold2.
Our proposed method pre-trains a large-scale protein language model with thousands of millions of primary sequences.
We obtain an end-to-end differentiable model to predict the 3D coordinates of atoms from only the primary sequence.
arXiv Detail & Related papers (2022-07-28T07:30:33Z)
- Three-dimensional microstructure generation using generative adversarial neural networks in the context of continuum micromechanics [77.34726150561087]
This work proposes a generative adversarial network tailored towards three-dimensional microstructure generation.
The lightweight algorithm is able to learn the underlying properties of the material from a single microCT-scan without the need of explicit descriptors.
arXiv Detail & Related papers (2022-05-31T13:26:51Z)
- Protein Structure and Sequence Generation with Equivariant Denoising Diffusion Probabilistic Models [3.5450828190071646]
An important task in bioengineering is designing proteins with specific 3D structures and chemical properties which enable targeted functions.
We introduce a generative model of both protein structure and sequence that can operate at significantly larger scales than previous molecular generative modeling approaches.
arXiv Detail & Related papers (2022-05-26T16:10:09Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
- Functional Protein Structure Annotation Using a Deep Convolutional Generative Adversarial Network [4.3871352596331255]
We introduce the use of a Deep Convolutional Generative Adversarial Network (DCGAN) to classify protein structures based on their functionality.
We train DCGAN on 3-dimensional (3D) decoy and native protein structures in order to generate and discriminate 3D protein structures.
arXiv Detail & Related papers (2021-04-18T22:18:52Z)
- Protein model quality assessment using rotation-equivariant, hierarchical neural networks [8.373439916313018]
We present a novel deep learning approach to assess the quality of a protein model.
Our method achieves state-of-the-art results in scoring protein models submitted to recent rounds of CASP.
arXiv Detail & Related papers (2020-11-27T05:03:53Z)