ModelAngelo: Automated Model Building in Cryo-EM Maps
- URL: http://arxiv.org/abs/2210.00006v1
- Date: Fri, 30 Sep 2022 16:47:45 GMT
- Title: ModelAngelo: Automated Model Building in Cryo-EM Maps
- Authors: Kiarash Jamali, Dari Kimanius and Sjors Scheres
- Abstract summary: We build ModelAngelo for automated model building of proteins in cryo-EM maps.
Recent advances in machine learning applications to protein structure prediction show potential for automating this process.
ModelAngelo outperforms the state-of-the-art and approximates manual building for cryo-EM maps with resolutions better than 3.5 Å.
- Score: 1.2891210250935146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Electron cryo-microscopy (cryo-EM) produces three-dimensional (3D) maps of
the electrostatic potential of biological macromolecules, including proteins.
At sufficient resolution, the cryo-EM maps, along with some knowledge about the
imaged molecules, allow de novo atomic modelling. Typically, this is done
through a laborious manual process. Recent advances in machine learning
applications to protein structure prediction show potential for automating this
process. Taking inspiration from these techniques, we have built ModelAngelo
for automated model building of proteins in cryo-EM maps. ModelAngelo first
uses a residual convolutional neural network (CNN) to initialize a graph
representation with nodes assigned to individual amino acids of the proteins in
the map and edges representing the protein chain. The graph is then refined
with a graph neural network (GNN) that combines the cryo-EM data, the amino
acid sequence data and prior knowledge about protein geometries. The GNN
refines the geometry of the protein chain and classifies the amino acids for
each of its nodes. The final graph is post-processed with a hidden Markov model
(HMM) search to map each protein chain to entries in a user provided sequence
file. Application to 28 test cases shows that ModelAngelo outperforms the
state-of-the-art and approximates manual building for cryo-EM maps with
resolutions better than 3.5 Å.
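The last step described in the abstract, mapping each built chain to an entry in a user-provided sequence file, can be illustrated with a much simpler stand-in. The sketch below is not ModelAngelo's HMM search: it assumes each chain node carries a per-residue amino-acid probability distribution (as a classifier might output) and scores only ungapped placements against each candidate sequence. The function names, the probability floor, and the toy data are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def best_ungapped_match(profile, sequences, floor=1e-6):
    """Find the best ungapped placement of a chain profile on any sequence.

    profile   -- list of dicts (one per chain node) mapping an amino-acid
                 letter to its predicted probability (hypothetical input)
    sequences -- dict mapping a sequence name to its amino-acid string
    floor     -- minimum probability, so log() never receives zero

    Returns (name, offset, log_score), or None if no sequence is long enough.
    """
    best = None
    n = len(profile)
    for name, seq in sequences.items():
        # Slide the chain along the sequence without gaps.
        for offset in range(len(seq) - n + 1):
            score = sum(
                math.log(max(profile[i].get(seq[offset + i], 0.0), floor))
                for i in range(n)
            )
            if best is None or score > best[2]:
                best = (name, offset, score)
    return best


def peaked_distribution(letter, confidence=0.9):
    """Toy classifier output: most probability mass on a single residue."""
    rest = (1.0 - confidence) / (len(AMINO_ACIDS) - 1)
    return {aa: (confidence if aa == letter else rest) for aa in AMINO_ACIDS}


# Toy example: a three-residue chain predicted as G, K, T.
profile = [peaked_distribution(aa) for aa in "GKT"]
sequences = {"chainA": "MAGKTLL", "chainB": "MSTNPKP"}
match = best_ungapped_match(profile, sequences)
print(match)
```

In this toy case the chain is placed on `chainA` at offset 2, where the sequence reads `GKT`. A real HMM search would additionally model insertions and deletions, which an ungapped scan cannot.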
Related papers
- Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model, which randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z) - Ranking protein-protein models with large language models and graph neural networks [49.1574468325115]
DeepRank-GNN-esm is a graph-based deep learning algorithm for ranking modelled PPI structures.
Here, we detail the use of our software with examples.
arXiv Detail & Related papers (2024-07-23T10:51:35Z) - Geometric Self-Supervised Pretraining on 3D Protein Structures using Subgraphs [26.727436310732692]
We propose a novel self-supervised method to pretrain 3D graph neural networks on 3D protein structures.
We experimentally show that our proposed pretraining strategy leads to significant improvements of up to 6%.
arXiv Detail & Related papers (2024-06-20T09:34:31Z) - Target-aware Variational Auto-encoders for Ligand Generation with
Multimodal Protein Representation Learning [2.01243755755303]
We introduce TargetVAE, a target-aware auto-encoder that generates ligands with high binding affinities to arbitrary protein targets.
This is the first effort to unify different representations of proteins into a single model, which we name the Protein Multimodal Network (PMN).
arXiv Detail & Related papers (2023-08-02T12:08:17Z) - EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction [49.674494450107005]
Predicting the binding sites of target proteins plays a fundamental role in drug discovery.
Most existing deep-learning methods consider a protein as a 3D image by spatially clustering its atoms into voxels.
This work proposes EquiPocket, an E(3)-equivariant Graph Neural Network (GNN) for binding site prediction.
arXiv Detail & Related papers (2023-02-23T17:18:26Z) - 3D Reconstruction of Protein Complex Structures Using Synthesized
Multi-View AFM Images [9.91587631689811]
We train a neural network for 3D reconstruction called Pix2Vox++ using the synthesized multi-view 2D AFM images dataset.
We compare the predicted structures obtained using different numbers of views and obtain an intersection over union (IoU) value of 0.92 on the training dataset and 0.52 on the validation dataset.
arXiv Detail & Related papers (2022-11-26T20:50:34Z) - Structure-aware Protein Self-supervised Learning [50.04673179816619]
We propose a novel structure-aware protein self-supervised learning method to capture structural information of proteins.
In particular, a well-designed graph neural network (GNN) model is pretrained to preserve the protein structural information.
We identify the relation between the sequential information in the protein language model and the structural information in the specially designed GNN model via a novel pseudo bi-level optimization scheme.
arXiv Detail & Related papers (2022-04-06T02:18:41Z) - Sequence-guided protein structure determination using graph
convolutional and recurrent networks [0.0]
Single particle, cryogenic electron microscopy (cryo-EM) experiments now routinely produce high-resolution data for large proteins.
Existing protocols for this type of task often rely on significant human intervention and can take hours to many days to produce an output.
Here, we present a fully automated, template-free model building approach that is based entirely on neural networks.
arXiv Detail & Related papers (2020-07-14T06:24:07Z) - Neural Cellular Automata Manifold [84.08170531451006]
We show that the neural network architecture of the Neural Cellular Automata can be encapsulated in a larger NN.
This allows us to propose a new model that encodes a manifold of NCA, each of them capable of generating a distinct image.
In biological terms, our approach would play the role of the transcription factors, modulating the mapping of genes into specific proteins that drive cellular differentiation.
arXiv Detail & Related papers (2020-06-22T11:41:57Z) - Uncovering the Folding Landscape of RNA Secondary Structure with Deep
Graph Embeddings [71.20283285671461]
We propose a geometric scattering autoencoder (GSAE) network for learning such graph embeddings.
Our embedding network first extracts rich graph features using the recently proposed geometric scattering transform.
We show that GSAE organizes RNA graphs both by structure and energy, accurately reflecting bistable RNA structures.
arXiv Detail & Related papers (2020-06-12T00:17:59Z)