CryoBench: Diverse and challenging datasets for the heterogeneity problem in cryo-EM
- URL: http://arxiv.org/abs/2408.05526v2
- Date: Thu, 16 Jan 2025 00:54:04 GMT
- Title: CryoBench: Diverse and challenging datasets for the heterogeneity problem in cryo-EM
- Authors: Minkyu Jeon, Rishwanth Raghu, Miro Astore, Geoffrey Woollard, Ryan Feathers, Alkin Kaz, Sonya M. Hanson, Pilar Cossio, Ellen D. Zhong
- Abstract summary: Cryo-electron microscopy (cryo-EM) is a powerful technique for determining high-resolution 3D biomolecular structures from imaging data. Here, we introduce CryoBench, a suite of datasets, metrics, and benchmarks for heterogeneous reconstruction in cryo-EM.
- Score: 3.424647356090208
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cryo-electron microscopy (cryo-EM) is a powerful technique for determining high-resolution 3D biomolecular structures from imaging data. Its unique ability to capture structural variability has spurred the development of heterogeneous reconstruction algorithms that can infer distributions of 3D structures from noisy, unlabeled imaging data. Despite the growing number of advanced methods, progress in the field is hindered by the lack of standardized benchmarks with ground truth information and reliable validation metrics. Here, we introduce CryoBench, a suite of datasets, metrics, and benchmarks for heterogeneous reconstruction in cryo-EM. CryoBench includes five datasets representing different sources of heterogeneity and degrees of difficulty. These include conformational heterogeneity generated from designed motions of antibody complexes or sampled from a molecular dynamics simulation, as well as compositional heterogeneity from mixtures of ribosome assembly states or 100 common complexes present in cells. We then analyze state-of-the-art heterogeneous reconstruction tools, including neural and non-neural methods, assess their sensitivity to noise, and propose new metrics for quantitative evaluation. We hope that CryoBench will be a foundational resource for accelerating algorithmic development and evaluation in the cryo-EM and machine learning communities. Project page: https://cryobench.cs.princeton.edu.
Related papers
- Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations.
Deep generative models have shown promise in generating protein conformations as a more efficient alternative.
We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z)
- CryoFM: A Flow-based Foundation Model for Cryo-EM Densities [50.291974465864364]
We present CryoFM, a foundation model designed as a generative model, learning the distribution of high-quality density maps.
Built on flow matching, CryoFM is trained to accurately capture the prior distribution of biomolecular density maps.
arXiv Detail & Related papers (2024-10-11T08:53:58Z)
- CryoChains: Heterogeneous Reconstruction of Molecular Assembly of Semi-flexible Chains from Cryo-EM Images [3.0828074702828623]
We propose CryoChains that encodes large deformations of biomolecules via rigid body transformation of their chains.
Our data experiments on the human GABA_B and heat shock protein show that CryoChains gives a biophysically-grounded quantification of the heterogeneous conformations of biomolecules.
arXiv Detail & Related papers (2023-06-12T17:57:12Z)
- Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images [68.8204255655161]
We propose a deep-learning framework to detect and classify areas of microscopic images with similar levels of gene expression.
The data we analyze requires an unsupervised learning model for which we employ a type of Artificial Neural Network - Deep Learning Autoencoders.
arXiv Detail & Related papers (2023-04-19T13:45:28Z)
- CryoFormer: Continuous Heterogeneous Cryo-EM Reconstruction using Transformer-based Neural Representations [49.49939711956354]
Cryo-electron microscopy (cryo-EM) allows for the high-resolution reconstruction of 3D structures of proteins and other biomolecules.
It is still challenging to reconstruct the continuous motions of 3D structures from noisy and randomly oriented 2D cryo-EM images.
We propose CryoFormer, a new approach for continuous heterogeneous cryo-EM reconstruction.
arXiv Detail & Related papers (2023-03-28T18:59:17Z)
- Latent Space Diffusion Models of Cryo-EM Structures [6.968705314671148]
We train a diffusion model as an expressive, learnable prior in the cryoDRGN framework.
By learning an accurate model of the data distribution, our method unlocks tools in generative modeling, sampling, and distribution analysis.
arXiv Detail & Related papers (2022-11-25T15:17:10Z)
- Amortized Inference for Heterogeneous Reconstruction in Cryo-EM [36.911133113707045]
Cryo-electron microscopy (cryo-EM) provides insights into the dynamics of proteins and other building blocks of life.
The algorithmic challenge of jointly estimating the poses, 3D structure, and conformational heterogeneity of a biomolecule remains unsolved.
Our method, cryoFIRE, performs ab initio heterogeneous reconstruction with unknown poses in an amortized framework.
We show that our method can provide one order of magnitude speedup on datasets containing millions of images without any loss of accuracy.
arXiv Detail & Related papers (2022-10-13T22:06:38Z)
- Heterogeneous reconstruction of deformable atomic models in Cryo-EM [30.864688165021054]
We describe a heterogeneous reconstruction method based on an atomistic representation whose deformation is reduced to a handful of collective motions.
We show for each distribution that our approach is able to recapitulate the intermediate atomic models with atomic-level accuracy.
arXiv Detail & Related papers (2022-09-29T22:35:35Z)
- Three-dimensional microstructure generation using generative adversarial neural networks in the context of continuum micromechanics [77.34726150561087]
This work proposes a generative adversarial network tailored towards three-dimensional microstructure generation.
The lightweight algorithm is able to learn the underlying properties of the material from a single microCT scan without the need for explicit descriptors.
arXiv Detail & Related papers (2022-05-31T13:26:51Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- SHREC 2021: Classification in cryo-electron tomograms [13.443446070180562]
Cryo-electron tomography (cryo-ET) is an imaging technique that allows three-dimensional visualization of macro-molecular assemblies.
Cryo-ET comes with a number of challenges, mainly low signal-to-noise and inability to obtain images from all angles.
We generate a novel simulated dataset to benchmark different methods of localization and classification of biological macromolecules in tomograms.
arXiv Detail & Related papers (2022-03-18T16:08:22Z)
- CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images [30.738209997049395]
We introduce cryoAI, an ab initio reconstruction algorithm for homogeneous conformations that uses gradient-based optimization of particle poses and the electron scattering potential from single-particle cryo-EM data.
CryoAI achieves results on par with state-of-the-art cryo-EM solvers for both simulated and experimental data, one order of magnitude faster for large datasets and with significantly lower memory requirements than existing methods.
arXiv Detail & Related papers (2022-03-15T17:58:03Z)
- A deep learning driven pseudospectral PCE based FFT homogenization algorithm for complex microstructures [68.8204255655161]
It is shown that the proposed method is able to predict central moments of interest while being orders of magnitude faster to evaluate than traditional approaches.
arXiv Detail & Related papers (2021-10-26T07:02:14Z)
- Deep learning based mixed-dimensional GMM for characterizing variability in CryoEM [0.0]
CryoEM provides direct visualization of individual macromolecules in different conformational and compositional states.
We present a machine learning algorithm to determine a conformational landscape for proteins or complexes.
We demonstrate this method on several different biomolecular systems to explore compositional and conformational changes at a range of scales.
arXiv Detail & Related papers (2021-01-25T19:05:23Z)
- Learning Mixtures of Low-Rank Models [89.39877968115833]
We study the problem of learning mixtures of low-rank models.
We develop an algorithm that is guaranteed to recover the unknown matrices with near-optimal sample complexity.
In addition, the proposed algorithm is provably stable against random noise.
arXiv Detail & Related papers (2020-09-23T17:53:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.